Fundamentals of astrological datamining – DRAFT

We hereby propose a novel method to improve the predictability of future sigma events by analyzing past sigma event datasets against planetary (i.e. astrological) datasets.

The algorithm first scans a dataset and identifies the relevant sigma events in the past: their time stamps, geolocations and magnitudes. Second, it identifies the planetary positions at each sigma event, and lastly it seeks planetary patterns (binary, ternary, quaternary vectors) underlying the presence of said sigma events in the dataset's past.
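The three scanning steps above can be sketched in Python; the function name, the 2-sigma threshold and the dataset fields are illustrative assumptions, not a reference implementation:

```python
def scan_for_patterns(events, ephemeris):
    """events: list of dicts with 'timestamp' and signed 'sd' keys.
    ephemeris: callable returning the planetary pair angles at a time."""
    # Step 1: keep only the relevant sigma events (|SD| above a threshold).
    sigma_events = [e for e in events if abs(e["sd"]) >= 2.0]
    # Step 2: attach the planetary angles in force at each event time.
    for e in sigma_events:
        e["angles"] = ephemeris(e["timestamp"])  # e.g. {"Sun-Moon": 123.4}
    # Step 3: group the signed SDs by planetary pair to expose clustering.
    clusters = {}
    for e in sigma_events:
        for pair, angle in e["angles"].items():
            clusters.setdefault(pair, []).append((angle, e["sd"]))
    return clusters
```

A fake ephemeris function stands in for the real table lookup here; the pattern search proper (over the clustered angles) comes later in the paper.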

If planetary vectors can be matched against each sigma deviation with relevant statistical certitude, then it is possible to predict when said sigma deviations are bound to happen in the future by extrapolating the planetary vectors through future ephemeris tables.

The magnitude of a future sigma event can then be either reduced or enhanced as desired through the implementation of policies and actions. A complex system's behaviour and output can thus be influenced in a desired way to improve its efficiency and reliability.

Such algorithms could be applied to traffic pattern prediction, hospitalization rate prediction, surgery success rate and recovery time prediction, and all sorts of events for which long-standing datasets are available.


Astrology has so far been confined to the domain of magic and fantasy; however, it is suspected that many behavioural aspects of human and biological activity are in sync with certain planetary positions, such as those of the Sun (considered a planet in the astrological tradition) and the Moon, whilst other, secondary planets are also suspected of playing some role and influencing the overall behaviour of certain systems and activities.

If astrological influences play a role in complex system behaviours, then these influences should be measurable as sigma events against the relative positions of planets within the Solar System, and specific Planetary Background Vectors (PBVs) should be clearly identifiable within consistent datasets, thus making it possible to predict when, in the future, certain sigma events are bound to repeat, with high statistical confidence.

The substrata

A substrata is hereby defined as a complex system on which a planetary background (PB) exerts a certain measurable influence by altering the system's behaviour, output, or any other measurable parameter of said system.

The system is then described by a dataset in which each line contains:

  1. A time stamp of the event/measure taking place within the system.
  2. A geolocation stamp.
  3. At least one parameter y(1): the variable or behaviour of interest of said system, or ideally its standard deviation (SD) magnitude with its + or − sign.
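A minimal sketch of one such dataset line in Python; the field names are illustrative assumptions:

```python
from dataclasses import dataclass
from datetime import datetime

# One line of the substrata dataset, per the three-item list above.
@dataclass
class EventRow:
    timestamp: datetime   # when the event/measure took place
    lat: float            # geolocation, decimal degrees
    lon: float
    y1: float             # variable of interest, ideally its signed SD magnitude

# Example: a -2.3 sigma event recorded in Milan one summer morning.
row = EventRow(datetime(2021, 6, 22, 8, 30), 45.46, 9.19, -2.3)
```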

A few examples of substrata are the road networks of major metropolitan areas, where the y(1) parameter of interest could be traffic data or the average commuting time at a given time of day.

It could also be the number of hospitalizations per day in a hospital, the success rate of surgeries, the recovery time after surgery, potentially market crash events, and so on and so forth.

Not all substrata are created equal or behave in the same way against similar Planetary Backgrounds (PBs). The traffic patterns and behaviours of a big metropolis can be very different from those of a small town in the middle of a desert, where different rules are at play and therefore different PBs may be relevant.

The definition of substrata underlines the concept of Rule Sets (RS) at play within the system under examination, such as people commuting to work every day using the same mix of available technologies and infrastructure. A given rule set is only stable for so long before new rules and behaviours take hold (imagine the exponential adoption of work-from-home policies), and these changes modify the substrata's behaviour and the relative influence of the Planetary Background Vectors (PBVs).

Still, it is possible to identify certain systems with long-established technological and social maturity, which implies a plateau of stability during which clear rule sets are at play and the influencing Planetary Background Vectors can therefore be identified clearly and reliably.

Market data is an example of substrata variation and evolution when considering central bank interventions in the stock markets. Here we can see how the old investment-savvy rules and strategies are being superseded by out-of-jail / free-meal / "the FED plunge protection team will buy us all up" / BTFD-type rules, which causes the old set of behaviours and strategies to yield less predictable outcomes than in previous periods of rule stability without central bank interventions.

The planetary background (PB)

Planetary backgrounds, also called astrological backgrounds, are well documented in datasets covering the past and are easily extrapolated into the future via broadly available ephemeris tables converted into plain angles (0° to 360°).

A number of planetary background types can be used: the simplest is the binary form, followed by the ternary and quaternary ones in line with the astrological tradition, though combinations of even more planets could be used.

A binary planetary background vector centered around 2D angle values is defined as follows:

PB_binary(t) = [ α(1,2)(t), α(1,3)(t), …, α(n−1,n)(t) ]

where α(i,j)(t) is the angle between planets i and j at time t.

A ternary planetary background vector is defined around 2 x 2D angles as follows:

PB_ternary(t) = [ (α(1,2)(t), β(1,2),3(t)), …, (α(n−2,n−1)(t), β(n−2,n−1),n(t)) ]

where each component pairs the angle α(i,j)(t) of a planetary pair with a second angle β(i,j),k(t) relating that pair to a third planet k.
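The binary vector can be sketched in Python from a table of ecliptic longitudes; the planet names and longitudes below are toy values for a single instant t, not real ephemeris data:

```python
from itertools import combinations

def binary_pbv(longitudes):
    """Build the binary PB vector from a dict of ecliptic longitudes
    (degrees, 0-360). Returns the pairwise separation angle alpha(i,j)
    for every planet pair, keyed by the sorted pair of names."""
    return {
        (a, b): (longitudes[b] - longitudes[a]) % 360.0
        for a, b in combinations(sorted(longitudes), 2)
    }

# Toy longitudes at one instant (made-up values):
pbv = binary_pbv({"Sun": 90.0, "Moon": 210.0, "Mars": 350.0})
```

A ternary vector would be built the same way by attaching, to each pair angle, a second angle to a third planet.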

Additional imaginary planets should be added, in line not just with the astrological tradition but also with obvious substrata behaviours, the main ones being:

  1. The daytime planet (traditionally the house position), a translation of the local 0–24 h clock into a 0°–360° angle, since the day and night cycle has a major influence on system activities.
  2. The season planet, to be multiplied by the sine of the local latitude. This is traditionally the astrological sign background, being −180° on the 21st of December and +180° on the 22nd of June (beware the precession of the equinoxes, which causes these dates to drift over the centuries!).

The justification for this imaginary planet is that people's and systems' behaviours differ in summer as opposed to winter, so this is an obvious influence we cannot ignore; we therefore reduce it to a plain angle (easy for a computer to process) as opposed to an abstract sign reference.

  3. Lastly, we should introduce a weekday planet (0° early Monday morning and 360° late Sunday evening), which is relevant for datasets linked to human activities but has no relevance for natural datasets (animals and bacteria don't modify their behaviour at weekends).
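The three imaginary planets can be sketched as plain angles in Python; the triangle-wave shape of the season planet and the exact scalings are my own assumptions about how the anchor dates above might be interpolated:

```python
from datetime import datetime
from math import sin, radians

def daytime_planet(ts):
    """Local 0-24 h clock mapped onto 0-360 degrees (15 deg per hour)."""
    return (ts.hour + ts.minute / 60.0 + ts.second / 3600.0) * 15.0

def season_planet(day_of_year, latitude_deg):
    """Seasonal angle scaled by sin(latitude), so it flips sign across the
    equator. Anchored at -180 near Dec 21 and +180 near Jun 22 per the text;
    the linear (triangle-wave) interpolation between them is an assumption."""
    phase = ((day_of_year - 355) % 365.25) / 365.25   # 0 at ~Dec 21
    angle = 180.0 - 720.0 * abs(phase - 0.5)          # -180 .. +180 .. -180
    return angle * sin(radians(latitude_deg))

def weekday_planet(ts):
    """Monday 00:00 -> 0 degrees, late Sunday -> approaching 360 degrees."""
    return (ts.weekday() + (ts.hour + ts.minute / 60.0) / 24.0) / 7.0 * 360.0
```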

For convenience, the ephemeris tables have been converted into Radialis, or plain 0–360° angles (times 10 or 100 depending on the desired angular accuracy), so one full Radialis angle spans 0–3600 (or 36000), since it is easier for computers to handle plain numbers than to convert signs into angles all the time.

An additional angle interpolation, useful to avoid the singularity between 360° and 0°, is to further convert said angles into 0–180–0 type angles (negative angles between 0 and −180 are taken in absolute value).
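That 0–180–0 folding can be sketched in one line, assuming angles in degrees:

```python
def fold_angle(angle_deg):
    """Fold any angle onto the 0-180 range, so that 350 and 10 degrees
    (both 10 degrees away from conjunction) compare as equal."""
    a = angle_deg % 360.0
    return min(a, 360.0 - a)
```

This removes the discontinuity at the 360°/0° boundary when binning or comparing PBV angles.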

Dataset analysis

The dataset under investigation (date + time, lat + long, standard deviation magnitude with its + or − sign) is expanded to include the PBV columns through interpolation of the ephemeris tables.

Of all the planetary pair columns, not all might be relevant, depending on the dataset's time span, defined as the difference (in days) between the maximum and minimum dates available.

Imagine the Moon–Sun planetary pair column, with a cycle time of about 29.5 days (the synodic month).

If a dataset only spans one or two days, then all the SD counts will crowd around a narrow Moon–Sun angle and this column will hardly be relevant.

If instead a dataset spans many years of observation and recording, then the Moon–Sun column will be relevant to the analysis.

Also remember that certain planetary cycles can span a hundred years or more, meaning we don't necessarily have enough dataset span and reliable data to meaningfully measure the influence of those planets.

A planetary pair column (also referred to herein as a vector) is only relevant to the analysis if its simplified cycle time is smaller (ideally many times smaller) than the dataset's time span.
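The relevance test can be sketched as follows; the five-cycle margin is an assumed threshold, not one prescribed by the text:

```python
def pair_is_relevant(cycle_days, span_days, min_cycles=5.0):
    """A planetary pair column is only worth analysing if the dataset
    spans several full cycles of that pair; min_cycles is an assumption
    standing in for 'ideally many times smaller'."""
    return span_days >= min_cycles * cycle_days
```

For a ten-year dataset, the ~29.5-day Moon–Sun pair passes easily, while a Jupiter-scale cycle of ~12 years does not.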

Next to each event line we can now calculate, through the ephemeris tables, all the PB vectors for said dataset.

Lastly, we count the SD deviations across the PBV angle span and plot the results against the planetary angle.

Through these tables it is possible to identify the planetary positions (angles) most unfavourable to a certain event (Ro 1 and following, defined as negative sigma events) as well as the most favourable angles (Tau 1 and following, defined as desirable sigma events).
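The counting step can be sketched as a simple binning of signed SDs by angle; the 10° bin width is an assumption, and Ro/Tau here are just the extreme bins:

```python
def rank_angles(events, bin_width=10.0):
    """events: list of (pbv_angle_deg, signed_sd) tuples. Sums the signed
    SD per angle bin and returns (ro_bin, tau_bin): the most unfavourable
    and most favourable angle bins as candidate Ro and Tau angles."""
    totals = {}
    for angle, sd in events:
        b = int((angle % 360.0) // bin_width) * bin_width
        totals[b] = totals.get(b, 0.0) + sd
    ro = min(totals, key=totals.get)   # most negative accumulated SD
    tau = max(totals, key=totals.get)  # most positive accumulated SD
    return ro, tau
```

A real analysis would also want per-bin event counts to judge statistical significance, not just the raw sums.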

By looking up the ephemeris tables into the future (or simply by adding its simplified cycle time to the date/time of a Ro vector (or Tau vector) event), it is possible to predict when a certain undesirable (or desirable) SD deviation is bound to occur again in the future.
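The "add the simplified cycle time" shortcut can be sketched as follows; this is a crude stand-in for a real future-ephemeris lookup, and the one-year horizon is an arbitrary choice:

```python
from datetime import datetime, timedelta

def next_occurrences(event_time, cycle_days, horizon_days=365):
    """Project a Ro/Tau event forward by whole multiples of its
    simplified cycle time, up to the given horizon."""
    out, t = [], event_time + timedelta(days=cycle_days)
    while t <= event_time + timedelta(days=horizon_days):
        out.append(t)
        t += timedelta(days=cycle_days)
    return out
```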

Closing the information loop with the substrata

Once one or more astrological correlations have been established for a certain substrata dataset, it is possible to implement policies to alter the substrata's behaviour, suppressing undesired Ro sigma events and enhancing desired Tau ones.

For example, knowing in advance that a certain day will be particularly bad for traffic, work-from-home alerts might be issued to reduce the Ro event magnitude for that day and possibly squash it down into positive (desirable) sigma territory.

Similarly, surgeries might be scheduled at more favourable times as opposed to less favourable ones, and so on and so forth.

By implementing new policies and protocols, the substrata itself and its otherwise stable rule sets are now being modified through astrological analysis and considerations: undesirable SD deviations (Alphas) are reduced whilst favourable ones (Betas) are promoted.

In this case the system's future datasets will no longer reproduce the same SD variations as its past datasets, because the substrata has been technologically influenced and the rule set governing the system's behaviour has been modified by the advent of astrological predictive techniques.

Certain systems are better suited than others to this kind of manipulation.

Traffic forecasting and preventive correction is typically a good candidate for this type of technology.

On the contrary, market trading rules and strategies inspired by astrological data associations will quickly spoil the rule set and the behaviour of market participants, so these algorithms are not really suited to stock market prediction, unless a very small number of participants are aware of such predictive algorithms and keep them secret (which can only work for so long before the cat is out of the bag and everybody else starts using them).

Historical notes

This kind of technology traditionally divides into a three-step process:

The first step is called the Sauron data analysis.

It scans past datasets and identifies the relevant Alpha and Beta influences of a system; it judges not, it only represents the behavioural status of the system in its natural, unperturbed state and identifies the relevant Ro and Tau vectors.

The second step is called the Galadriel analysis, where policies and strategies are studied in order to minimize negative SD (Ro) events and promote positive SD (Tau) events.

Lastly, the Frodo part revolves around practical ways to implement the relevant policies identified above.

Once new policies are introduced within a system, new rules and behaviours are at play within the substrata, so data collected in the future will no longer show the typical SD variations and astrological correlations highlighted during the Sauron study.

A new substrata with a new set of rules is at play. The only way to further refine and optimize the new system's behaviour is to wait for it to mature and stabilize its rules and performance, then gather enough system data over long enough time spans to start another iteration of the optimization process highlighted above.

We know the how but what about the why?

If such software can indeed identify astrological trends within datasets, then the next question is: why are human or natural processes influenced by relative planetary positions?

Whilst the answer to this is anybody's guess, my personal explanation revolves around the concept of "light quality".

The Sun provides a steady quality of light and spectrum; the presence of other bodies reflecting more or less light back to Earth, at different angles and radiation intensities (albeit very feeble ones compared to the Sun's), means all these additional light sources from within the solar system add a flavour to the light quality we are exposed to: a flavour of the moment.

This in turn averages out to a complex system behaving in a preferred way in response to the light quality as a function of planetary positions, and possibly even relative velocities and apparent magnitudes…


The possibility of predicting complex system behaviour and output against astrological backgrounds and ephemeris tables is technologically at hand, given the IT infrastructure and software resources available nowadays, capable of scanning vast datasets in small amounts of time.

A draft version will be prototyped in Excel VBA; it can be used to scan any kind of dataset and verify the assumptions made in this paper.

Dedicated compiled software, running on better computers or even supercomputers, might be needed to analyze longer and more detailed datasets against ternary and quaternary PBs, because of the processing intensity of extracting so many PB columns for each event line.

