Wolfram|Alpha Blog
http://blog.wolframalpha.com/feed/atom/

A Year in a Chemistry Developer’s Shoes
Tal Einav
2014-08-28T14:51:46Z (updated 2014-08-28T15:24:37Z)
http://blog.internal.wolframalpha.com/?p=28510

I have always seen the beauty of chemistry from a scientific standpoint: strange shapes, eye-catching patterns, giant explosions… But it was not until I came to Wolfram|Alpha that I began to appreciate just how sleek chemistry is from a programming perspective. Just a few lines of code are needed to create some of the most startling phenomena and give life to elegant theories.


In Wolfram|Alpha, some property values are stored in a database and are called non-computed properties; the remaining properties are called computed properties because they are calculated from the non-computed properties. MoleculePlot is a computed property, and so generating the above plot takes a bit more than one line of code (internally, the main function takes roughly 1,000 lines of code). Yet even if you restrict yourself to non-computed properties, you can create such diagrams with very little code. Here is a basic diagram of acetic acid from non-computed properties compared to the computed property CHBlackStructureDiagram.

Basic versus fancy diagram

Of course, this code is bare bones: it cannot display ions or isotopes, it will squish large molecules, and it does not have color. Nevertheless, for a few lines of code, it does a surprisingly good job.

To give you some perspective, in this past year I added 35,000 lines of code and modified an existing 30,000 lines of code. Much of it went into creating new functionality such as our periodic table (200 lines of code). The beauty of this periodic table lies not only in its new color scheme (which matches Mathematica 10’s color scheme, and can be seen through the command ColorData[97]), but also in the pull-down menu featuring neat properties such as boiling point, electronegativity, and the year each atom was discovered.

Periodic table

A lot of work went into upgrading our framework so that when you query for properties such as MoleculePlot or CHBlackStructureDiagram, the result comes back quickly and efficiently. Sometimes the renovations are quick and simple, such as showing as much data as possible (20 lines of code); sometimes it requires completely re-thinking how we handle objects such as chemical reactions (5,000 lines of code and a lot of thought).

While it is fantastic to find an answer to your query, it is even better to understand the procedure used to get to that answer. One of the most exciting new directions we took this year was to expand our Step-by-step functionality into chemistry.

Step-by-step Lewis structure

While computing ATP’s MoleculePlot above may be an intricate task, writing these Step-by-steps can quickly become breathtakingly complicated. As with all of Wolfram|Alpha, the steps are written using Mathematica, which means it is up to the developer to program the grammatical structure. As you might imagine, this can cause what appears to be a very basic sentence to be much more complicated under the hood. Let’s take a basic example: the second step lists the number of valence electrons in the chemical. There can be one, two, or three or more types of atoms—each of which requires different grammatical syntax. Additionally, other clauses may be stuck in the beginning, middle, or end of the sentence to describe special cases (e.g. for chemicals with a nonzero net charge), each of which requires support for punctuation, capitalization, and grammar.
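
The branching on one, two, or three or more items can be illustrated with a minimal sketch. It is written in Python rather than Mathematica, and the names `join_items` and `valence_sentence`, as well as the sentence template, are hypothetical and not Wolfram|Alpha's actual code.

```python
def join_items(items):
    """Join phrases with the grammar needed for one, two, or three or more items."""
    items = list(items)
    if not items:
        raise ValueError("need at least one item")
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    # Three or more: comma-separated, with a serial comma before "and".
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def valence_sentence(counts):
    """Build a sentence from a mapping such as {"carbon": 4, "oxygen": 6}."""
    parts = [f"{n} for {element}" for element, n in counts.items()]
    # Noun and verb must agree with the number of atom types.
    noun, verb = ("count", "is") if len(parts) == 1 else ("counts", "are")
    return f"The valence electron {noun} {verb} {join_items(parts)}."


print(valence_sentence({"carbon": 4, "hydrogen": 1, "oxygen": 6}))
```

Even this toy version needs three list cases plus subject-verb agreement, before any special-case clauses for charged species are added.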

So far, so simple. But to do a really great job, the steps have to vary their diction, intelligently short-circuit to the answer whenever possible, and generally sound human. We want you to think of Wolfram|Alpha as your best friend, not only ready to answer your questions, but also to explain how she arrived at her answers.

All this would be tricky, but still straightforward, if it were not for all of the crazy exceptions that abound in chemistry! For practically every rule you can think of involving molecular bonding, there is an exception out there (as you learn from your first chemistry course); with over 40,000 molecules in our database, you can test any hypothesis and immediately find counter-examples. As the steps progress, they can branch off into an incredible number of different paths. It’s a thrilling ride to construct such an algorithm in a cogent and coherent pattern (not to mention generating robust tests for both calculation errors and grammatical inconsistencies)!

That is what made creating the Lewis structure Step-by-step such an exciting experience, especially after I discovered that the standard procedure to draw Lewis structures fails for many molecules in our database. Ultimately, the end product was a very robust Step-by-step that accounts for all ChemicalData entities. (For full disclosure: the steps do not work for 3-center 2-electron bonds or 3-center 4-electron bonds, such as in diborane or bifluoride. Therefore, the steps can handle all of our chemicals except 2.)

To give some perspective, simply drawing the Lewis dot structure diagrams took 200 lines of code; the Step-by-step took 1,000 lines of code.

User feedback was very positive for this first Step-by-step, prompting us to expand the functionality to other areas of chemistry. We decided to first tackle the most popular chemistry queries in Wolfram|Alpha, aiming primarily at important topics someone would learn in an introductory chemistry course. In addition, the Step-by-steps should fit together and complement one another. Creating the Lewis structure Step-by-step naturally suggested calculating oxidation states. We also wanted to break into chemical reactions: balancing chemical reactions begot computing reaction stoichiometry, which begot converting between units, which begot preparing solutions.

We tried to make each Step-by-step encompass as wide a field as possible. For example, the stoichiometry Step-by-step will calculate the theoretical yield of products if only given amounts for reactants (e.g. 2 grams glucose + oxygen -> water + carbon dioxide), will calculate the percentage yield of products if given amounts of reactants and products (e.g. 0.2 mol CH4 + O2 -> 7 mL H2O + CO2), and will calculate the amount of reactants needed if only given amounts of products (e.g. C6H6 + NO2+ -> 0.02 mols C6H5NO2 + H+). We plan to continue expanding to other areas, and user feedback always helps us prioritize what to tackle next!
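
To make the first case concrete, here is a rough sketch of the theoretical-yield arithmetic for 2 grams of glucose. It is plain Python, not the Step-by-step implementation. The balanced equation and the rounded molar masses are entered by hand, and glucose is assumed to be the limiting reagent.

```python
# Molar masses in g/mol (standard values, rounded).
MOLAR_MASS = {"C6H12O6": 180.156, "O2": 31.998, "H2O": 18.015, "CO2": 44.009}

# Balanced reaction: C6H12O6 + 6 O2 -> 6 H2O + 6 CO2
REACTANTS = {"C6H12O6": 1, "O2": 6}
PRODUCTS = {"H2O": 6, "CO2": 6}


def theoretical_yield(reactant, grams):
    """Return grams of each product, assuming `reactant` is the limiting reagent."""
    moles = grams / MOLAR_MASS[reactant]
    extent = moles / REACTANTS[reactant]  # moles of reaction that can occur
    return {p: extent * coeff * MOLAR_MASS[p] for p, coeff in PRODUCTS.items()}


yields = theoretical_yield("C6H12O6", 2.0)
print({product: round(grams, 2) for product, grams in yields.items()})
```

With these inputs the yield comes to roughly 1.20 g of water and 2.93 g of carbon dioxide.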

For users who want to get into the driver’s seat, Wolfram|Alpha’s tighter integration into Mathematica 10 grants mouth-watering access to our chemistry data (along with element, isotope, and thermodynamic data). We exposed over 50 new properties and gave the remaining properties a serious tune-up, including upgrading to the new Version 10 formats for Quantity, Entity, and Association!

Both new and experienced users should check out the ChemicalData documentation page, where all the chemistry functionality is explained along with numerous examples of how it can be used. Once you are ready and eager to access the data, we recommend using the Ctrl+= interface to discover entities and properties. For example, typing “benzene” into this interface will yield the right-hand side of:


Alternatively, you can peruse our entire list of chemicals. Here’s a random sample of five entities.

Random sample of five entities

Each entity has a list of properties that you can query. The full list is rather large, so let’s just focus in on properties that begin with the letter M.

Properties that begin with letter M

You can now use ChemicalData to perform an entity-property query and to access the data for each of these properties.

ChemicalData entity-property query

You can use free-form input to discover a specific entity-property query directly by typing it into the Ctrl+= interface. For example, typing in “benzene molar mass” yields a ChemicalData expression that you can evaluate to find the molar mass.

Benzene molar mass

Because this interface links directly to Wolfram|Alpha, it grants you significant flexibility on how you enter your input. For example, you can alter the order of the input (“molar mass benzene”), insert filler words (“what is the molar mass of benzene”), use generic phrases (“benzene mass”), misspell words (“benzne masss”), and so much more. Wolfram|Alpha has got your back!

With that introduction, you have all the tools to play around in various fields of chemistry. For example, you can do a systematic study of your favorite molecule—let’s pick cyclobutane. One of the important features of cyclobutane is its bonding structure, so let’s consider the chemicals whose graphs are isomorphic to cyclobutane’s along with their boiling points and molar masses.


Here we find both the general trend that boiling point increases with molar mass as well as the (expected?) exceptions—the chemicals in the first and third slots. This invites a series of questions: Do fluorocarbons like octafluorocyclobutane always have a significantly lower boiling point than their hydrocarbon counterparts? Is fluorine special, or would the same phenomenon happen with other halogens? Could we make a program that can predict a molecule’s boiling point based on its structure diagram? There are countless directions to go!

The combination of Mathematica and Wolfram|Alpha transforms chemistry into an incredibly fun subject to explore. What amazing things can you create once you have the right tools for the job? We have worked to make chemistry not just intellectually stimulating and visually appealing, but a truly sexy aspect of the Wolfram Language. As we continue expanding Wolfram|Alpha’s capabilities, we welcome your recommendations. Share your thoughts on how to make chemistry even more beautiful, suggest what future directions you would like us to delve into, or simply shout out to the world that you love Wolfram|Alpha!

Which Is Closer: Local Beer or Local Whiskey?
Michael Trott
2014-08-19T15:35:31Z (updated 2014-08-21T21:51:44Z)
http://blog.internal.wolframalpha.com/?p=28078

In today’s blog post, we will use some of the new features of the Wolfram Language, such as language processing, geometric regions, map-making capabilities, and deploying forms, to analyze and visualize the distribution of beer breweries and whiskey distilleries in the US. In particular, we want to answer the core question: for which fraction of the US is the nearest brewery further away than the nearest distillery?

Disclaimer: you may read, carry out, and modify inputs in this blog post independent of your age. Hands-on taste tests might require a certain minimal legal age (check your country’s and state’s laws).

We start by importing two images from Wikipedia to set the theme; later we will use them on maps.

Image of beer vs. image of whiskey

We will restrict our analysis to the lower 48 states. We get the polygon of the US and its latitude/longitude boundaries for repeated use in the following.

Polygon of the US and its latitude/longitude boundaries

And we define a function that tests if a point lies within the continental US.

We define a function that tests if a point lies within the continental US
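
A point-in-region test of this kind can be sketched with the classic ray-casting algorithm. This is a generic Python illustration, not the Wolfram Language code used in the post. The `rough_box` polygon is a hypothetical stand-in for the real US boundary, and longitude/latitude are treated as planar coordinates.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as a list of vertices?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


# Hypothetical stand-in for the lower 48: a crude (longitude, latitude) box.
rough_box = [(-125.0, 24.0), (-66.0, 24.0), (-66.0, 50.0), (-125.0, 50.0)]

print(point_in_polygon((-88.24, 40.12), rough_box))  # near Champaign, IL
print(point_in_polygon((-150.0, 61.0), rough_box))   # in Alaska
```

With a real boundary polygon the same function applies unchanged; only the vertex list grows.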

We start with beer. Let’s have a look at the yearly US beer production and consumption over the last few decades.

Yearly US beer production and consumption

This production puts the US in second place, after China, on the world list of beer producers. (More details about the international beer economy can be found here.)

This production puts the US in second place, after China, on the world list of beer producers

And here is a quick look at the worldwide per capita beer consumption.

Worldwide per capita beer consumption

The consumption of the leading 30 countries in natural units, kegs of beer:

Consumption of the leading 30 countries in natural units, kegs of beer

Some countries prefer drinking wine (see here for a detailed discussion of this subject). The following graphic shows (on a logarithmic base 2 scale) the ratio of beer consumption to wine consumption. Negative logarithmic ratios mean a higher wine consumption compared to beer consumption. (See the American Association of Wine Economists’ working paper no. 79 for a detailed study of the correlation between wine and beer consumption with GDP, mean temperature, etc.)

Ratio of beer consumption to wine consumption
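
The logarithmic ratio itself is a one-liner. In this small Python sketch the consumption figures are made-up numbers chosen only to show the sign convention.

```python
import math


def log2_ratio(beer, wine):
    """Base-2 logarithm of the beer-to-wine consumption ratio (same units for both)."""
    return math.log2(beer / wine)


# Hypothetical per capita figures in liters per year.
print(log2_ratio(80, 20))  # positive: four times as much beer as wine
print(log2_ratio(10, 40))  # negative: four times as much wine as beer
```

Each unit on this scale corresponds to a factor of two, so +2 and -2 are mirror images of each other.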

We start with the beer breweries. To plot and analyze, we need a list of breweries. The Wolfram Knowledgebase contains data about a lot of companies, organizations, food, geographic regions, and global beer production and consumption. But breweries are not yet part of the Wolfram Knowledgebase. With some web searching, we can more or less straightforwardly find a web page with a listing of all US breweries. We then import the data about 2600+ beer breweries in the US as a structured dataset. This is an all-time high over the last 125 years. (For a complete list of historical breweries in the US, you can become a member of the American Breweriana Association and download their full database, which also covers long-closed breweries.)

Beer breweries

Here are a few randomly selected entries from the dataset.

Random selections from dataset

We see that for each brewery, we have its name, the city where it is located, its website URL, and its phone number (the BC, BP, and similar abbreviations indicate whether and what you can eat with your beer, which is irrelevant for today’s blog post).

Next, we process the data, remove breweries no longer in operation, and extract brewery names, addresses, and ZIP codes.

Processing the data

We now have data for 2600+ breweries.

Data for 2600+ breweries

For a geographic analysis, we resolve the ZIP codes to actual lat/long coordinates using the EntityValue function.

Resolve ZIP codes to actual lat/long coordinates using EntityValue function

Unfortunately, not all ZIP codes were resolved to actual latitudes and longitudes. These are the ones where we did not successfully find a geographic location.

Unsuccessful geographic location resolving

Why did we not find coordinates for these ZIP codes? As frequently happens with non-programmatically curated data, there are mistakes in the data, and so we will have to clean it up. The easiest way would be to simply ignore these breweries, but we can do better. These are the actual entries of the breweries with missing coordinates.

Actual entries of the breweries with missing coordinates

A quick check at the USPS website shows that, for instance, the first of the above ZIP codes, 54704, is not a ZIP code that the USPS recognizes and/or delivers mail to.

So no wonder the Wolfram Knowledgebase was not able to find a coordinate for this “ZIP code”. Fortunately, we can make progress in fixing the incorrect ZIP codes programmatically. Assume the nonexistent ZIP code was just a typo. Let’s find a ZIP code in Madison, WI that has a small string distance to the ZIP code 54704.

Find a ZIP code in Madison, WI that has a small string distance to the ZIP code 54704

The ZIP code 53704 is in string (and Euclidean) distance as near as possible to 54704.

ZIP code 53704 is in string (and Euclidean) distance as near as possible to 54704

And taking a quick look at the company’s website confirms that 53704 is the correct ZIP code. This observation, together with the programmatic ZIP code lookups, allows us to define a function to programmatically correct the ZIP codes in case they are just simple typos.

Define function to programmatically correct the ZIP codes in case they are just simple typos
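
The idea behind such a correction function can be sketched as follows. This is a Python illustration under stated assumptions, not the function from the post: it uses Levenshtein edit distance as the string distance, breaks ties by numeric difference, and `madison_zips` is an illustrative, incomplete list of Madison ZIP codes typed in by hand.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming, row by row)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def correct_zip(bad_zip, valid_zips):
    """Pick the valid ZIP closest in edit distance; break ties by numeric difference."""
    return min(valid_zips,
               key=lambda z: (edit_distance(bad_zip, z), abs(int(z) - int(bad_zip))))


# Illustrative subset of Madison, WI ZIP codes (not a complete list).
madison_zips = ["53703", "53704", "53705", "53711", "53713", "53714", "53716"]

print(correct_zip("54704", madison_zips))
```

Restricting the candidates to ZIP codes of the listed city is what keeps a single-digit typo from being "corrected" to a valid ZIP code somewhere else in the country.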

For instance, for Black Market Brewing in Temecula, we find that the corrected ZIP code is 92590.

Corrected ZIP code example

So, to clean the data, we perform some string replacements to get a dataset that has ZIP codes that exist.

Cleaning data to get dataset with existing ZIP codes

We now acquire coordinates again for the corrected dataset.

We now acquire coordinates again for the corrected dataset

Now we have coordinates for all breweries.

Coordinates for all breweries

And all ZIP codes are now associated with a geographic position. (At least when I wrote the blog post; because the website we used is updated regularly, new typos could appear at a later point in time, and the fixDataRules would have to be updated accordingly.)

All ZIP codes are now associated with a geographic position

Now that we have coordinates, we can make a map with all the breweries indicated.

Map with all the breweries indicated

Let’s pause for a moment and think about what goes into beer. According to the Reinheitsgebot from November 1487, it’s just malted barley, hops, and water (plus yeast). The detailed composition of water has an important influence on a beer’s taste. The water composition in turn relates to hydrogeology. (See this paper for a detailed discussion of the relation.) Carrying out a quick web search lets us find a site showing important natural springs in the US. We import the coordinates of the springs and plot them together with the breweries.

Import the coordinates of the springs and plot them together with the breweries

We redraw the last map, but this time add the natural springs in blue. Without trying to quantify the correlation here between breweries and springs, a visual correlation is clearly visible.

Visual correlation is clearly visible

We quickly calculate a plot of the distribution of the distances of a brewery to the nearest spring from the list springPositions.

Calculate a plot of the distribution of the distances of a brewery to the nearest spring

And if we connect each brewery to the nearest spring, we obtain the following graphic.

Connect each brewery to the nearest spring

We can also have a quick look at which regions of the US can use their local barley and hops, as the Wolfram Knowledgebase knows in which US states these two plants can be grown.

US regions that use local barley and hops

(For the importance of spring water for whiskey, see this paper.) Most important for a beer’s taste are the hops (see this paper and this paper for more details). The α-acids of hops give the beer its bitter taste. The most commonly occurring α-acid in hops is humulone. (To refresh your chemistry knowledge, see the Step-by-step derivation for where to place the dots in the diagram below.)


But let’s not be sidetracked by chemistry and instead focus in this blog post on geographic aspects relating to beer.

Historically, a relationship has existed between beer production and the church (in the form of monasteries; see “A Comprehensive History of Beer Brewing” for details). Today we don’t see a correlation (other than through population densities) between religion and beer production. Just to confirm, let’s draw a map of major churches in the US together with the breweries. At the website of the Hartford Institute, we find a listing of major churches. (Yes, it would have been fun to really draw all 110,000+ churches of the US on a map, but the blog team did not want me to spend $80–$100 to buy a US church database and support spam-encouraging companies, e.g., from here or here.)

Beers vs. churches

Back to the breweries. Instead of a cloud of points of individual breweries we can construct a continuous brewery probability field and plot it. This more prominently shows the hotspots of breweries in the US. To do so, we calculate a smooth kernel distribution for the brewery density in projected coordinates. We use the Sheather–Jones bandwidth estimator, which relieves us from needing to specify an explicit bandwidth. Determining the optimal bandwidth is a nontrivial calculation and will take a few minutes.

Sheather–Jones bandwidth estimator
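
For readers who want to see what a kernel density estimate does, here is a simplified Python sketch. It is deliberately not the method used in the post: it uses an isotropic Gaussian kernel with a fixed, user-supplied bandwidth instead of the Sheather–Jones estimator, and the sample points are hypothetical projected coordinates.

```python
import math


def gaussian_kde_2d(points, bandwidth):
    """Return a density function built from 2D points with an isotropic Gaussian kernel."""
    n = len(points)
    # Normalization so that the density integrates to 1 over the plane.
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            r2 = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-r2 / (2.0 * bandwidth ** 2))
        return norm * total

    return density


# Hypothetical projected coordinates: a tight cluster plus one isolated point.
sample = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
density = gaussian_kde_2d(sample, bandwidth=0.5)
print(density(0.0, 0.0) > density(5.0, 5.0))  # the cluster is the hotspot
```

The bandwidth controls how far each brewery's influence is smeared out. Choosing it automatically is exactly the step that the Sheather–Jones estimator handles.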

We plot the resulting distribution and map the resulting image onto a map of the US. Blue denotes a low brewery density and red a high one. Denver, Oregon, and Southern California clearly stand out as local hotspots.

We plot the resulting distribution and map the resulting image onto a map of the US

The black points on top of the brewery density map are the actual brewery locations.

Brewery density map

Using the brewery density as an elevation, we can plot the beer topography of the US. Previously unknown (beer-density) mountain ranges and peaks become visible in topographically flat areas.

Beer topography of the US

The next graphic shows a map where we accumulate the brewery counts by latitude and longitude. Similar to the classic wheat belt, we see two beer belts running East to West and two beer belts running North to South.

Brewery longitude-latitude

Let’s determine the elevations of the breweries and make a histogram to see whether there is more interest in a locally grown beer at low or high elevations.

Elevations of breweries

It seems that elevations between 500 and 1500 ft are most popular for places making a fresh cold barley pop (with an additional peak at around 5000 ft caused by the many breweries in the Denver region).

Brewer elevation histogram

For further use, we summarize all relevant information about the breweries in breweryData.

Summarize relevant information about breweries

We define some functions to find the nearest brewery and the distance to the nearest brewery.

Define functions to find the nearest brewery
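
A nearest-place lookup of this kind can be sketched in a few lines. This Python version is only an approximation of what the post does: it uses the haversine formula on a spherical Earth instead of true geodesics, scans all places linearly, and the `places` dictionary holds hypothetical names and coordinates.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(p, q):
    """Great-circle distance in miles between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest(position, places, k=1):
    """Return the k nearest (distance, name) pairs; `places` maps name -> (lat, lon)."""
    ranked = sorted((haversine_miles(position, loc), name) for name, loc in places.items())
    return ranked[:k]


# Hypothetical entries, for illustration only.
places = {"Brewery A": (40.0, -88.0), "Brewery B": (45.0, -120.0)}
print(nearest((40.1, -88.2), places))
```

For a few thousand breweries the linear scan is fast enough; the 320,000-point grid used later in the post would benefit from a spatial index.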

Here are the nearest breweries from the Wolfram headquarters in Champaign, IL.

Breweries close to Champaign, IL

And here is a plot of the distances from Champaign to all breweries, sorted by size. After accounting for the breweries in the immediate neighborhood of Champaign, for the first nearly 1000 miles we see a nearly linear increase in the number of breweries with a slope of approximately 2.1 breweries/mile.

Plot of the distances from Champaign to all breweries

Now that we know where to find a freshly brewed beer, let’s switch focus and concentrate on whiskey distilleries. Again, after some web searching we find a web page with a listing of all distilleries in the continental US. Again, we read in the data, this time in unstructured form, extract the distillery names and cities, and carry out some data cleanup as we go.

Whiskey data extraction

This time, we have the name of the distillery, their website, and the city as available data. Here are some example distilleries.

Example distilleries

A quick check shows that we did a proper job in cleaning the data and now have locations for all distilleries.

Example distilleries

We now have a list of about 500 distilleries.

502 distilleries

We retrieve the elevations of the cities with distilleries.

Elevations of cities with distilleries

The average elevation of a distillery does not deviate much from the one for breweries.

Little deviation between elevation of distilleries and breweries

We summarize all relevant information about the distilleries in distilleryData.

We summarize all relevant information about the distilleries in distilleryData

Define functions to find the nearest distillery and the distance to the nearest distillery.

Define functions to find the nearest distillery and the distance to the nearest distillery

We now use the function nearestDistilleries to locate the nearest distillery and make a map of the bearings to take to go to the nearest distillery.


Let’s come back to breweries. What’s the distribution by state? Here are the states with the most breweries.

Brewery distribution by state

If we normalize by state population, we get the following ranking.

Normalizing for state population

And which city has the most breweries? We accumulate the ZIP codes by city. Here are the top dozen cities by brewery count.

Cities with most breweries

And here is a more visual representation of the top 25 brewery cities. We show a beer glass over the top brewery cities whose size is proportional to the number of breweries.

Visual representation of top 25 brewery cities

Oregon isn’t a very large state, and it includes beer capital Portland, so let’s plan a trip to visit all breweries. To minimize driving, we calculate the shortest tour that visits all of the state’s breweries. (All distances are along geodesics, not driving distances on roads.)

Calculate the shortest tour that visits all of Oregon's breweries
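
Finding a truly shortest tour is the traveling salesman problem. As a rough illustration of how a short tour can be found, here is a Python sketch of a common heuristic (a nearest-neighbor start followed by 2-opt improvement). It is not the algorithm used in the post, it does not guarantee an optimal tour, and it uses planar Euclidean distance by default rather than geodesics.

```python
import math


def tour_length(points, order, dist=math.dist):
    """Length of the closed tour that visits points in the given order."""
    return sum(dist(points[order[i]], points[order[(i + 1) % len(order)]])
               for i in range(len(order)))


def nearest_neighbor_tour(points, dist=math.dist):
    """Greedy starting tour: always go to the closest unvisited point."""
    unvisited = set(range(1, len(points)))
    order = [0]
    while unvisited:
        last = points[order[-1]]
        nxt = min(unvisited, key=lambda i: dist(last, points[i]))
        unvisited.remove(nxt)
        order.append(nxt)
    return order


def two_opt(points, order, dist=math.dist):
    """Improve a tour by reversing segments until no reversal shortens it."""
    best = list(order)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(points, candidate, dist) < tour_length(points, best, dist) - 1e-12:
                    best, improved = candidate, True
    return best


# Four corners of a unit square, listed in a crossing order.
corners = [(0, 0), (1, 1), (1, 0), (0, 1)]
tour = two_opt(corners, nearest_neighbor_tour(corners))
print(tour, tour_length(corners, tour))
```

This simple 2-opt recomputes the full tour length for every candidate, which is fine for a single state but slow for thousands of stops.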

A visit to all Oregon breweries will be a 1,720-mile drive.

A visit to all Oregon breweries will be a 1720-mile drive

And here is a sketch of the shortest trips that hit all breweries for each of the lower 48 states.

Sketch of the shortest trips that hit all breweries for each of the lower 48 states

Let’s quickly make a website that lets you plan a short beer tour through your state (and maybe some neighboring states). The function makeShortestTourDisplay calculates and visualizes the shortest path. For comparison, the length of a tour with the breweries chosen in random order is also shown. The shortest path often saves a factor of 5 to 15 in driving distance.

Beer tour
Shortest tour display
Drive responsibly on brewery tours!

We deploy the function makeShortestTourDisplay to let you easily plan your favorite beer state tours.

Deploy the function makeShortestTourDisplay
Making beer tour plan

And if the reader has time to take a year off work, a visit to all breweries in the continental US is just a 41,000-mile trip.

Brief 41,000-mile trip

The collected caps from such a trip could make beautiful artwork! Here is a graphic showing one of the possible tours. The color along the tour changes continuously with the spectrum, and we start in the Northeast.

Possible tour

On average, we would have to drive just 15 miles between two breweries.

Fifteen-mile drive between two breweries

Here is a distribution of the distances.

Distribution of the distances

Such a trip covering all breweries would involve driving nearly 300 miles up and down.

Driving distance of 300 miles up and down
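
One plausible way to arrive at such a figure is to add up the absolute elevation changes between consecutive stops. The Python sketch below assumes exactly that; the post does not spell out its method, and the elevation profile here is hypothetical. Terrain between stops is ignored.

```python
FEET_PER_MILE = 5280


def total_climb_and_descent(elevations):
    """Sum of absolute elevation changes between consecutive stops (same unit as input)."""
    return sum(abs(b - a) for a, b in zip(elevations, elevations[1:]))


# Hypothetical elevations in feet along a short tour.
profile_ft = [500, 5280, 1200]
print(total_climb_and_descent(profile_ft) / FEET_PER_MILE, "miles up and down")
```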

Here is a plot of the height profile along the trip.

Height profile along the trip

We compare the all-brewery trip with the all-distillery trip, which is still about 21,000 miles.

All brewery vs. all distillery

To calculate the distribution function for the average distance from a US citizen to the nearest brewery and similar facts, we build a list of coordinates and the population of all ZIP code regions. We will only consider the part of the population that is older than 21 years. We retrieve this data for the ~30,000 ZIP codes.

List of coordinates and the population of all ZIP code regions

We exclude the ZIP codes that are in Alaska, Hawaii, and Guam and concentrate on the 48 states of the continental US.

Exclude ZIP codes in Alaska, Hawaii, and Guam

We will take into account adults from the ~29,000 populated ZIP code areas with a non-vanishing number of adults totaling about 214 million people.

Adults from the ~29,000 populated ZIP code areas with a non-vanishing number of adults totaling about 214 million people

Now that we have a function to calculate the distance to the nearest brewery at hand and a list of positions and populations for all ZIP codes, let’s do some elementary statistics using this data.

Elementary statistics using this data

Here is a plot of the distribution of distances from all ZIP codes to the nearest brewery.

Distribution of distances from all ZIP codes to nearest brewery

More than 32 million Americans have a local brewery within their own ZIP code region.

Over 32 million Americans have local brewery within their ZIP code region

While ~15% of the above-drinking-age population is located in the same ZIP code as a brewery, this does not imply zero distance to the next brewery. As a rough estimate, we will model the distance within a ZIP code as the distance between two random points. In the spirit of the famous spherical cow, we will approximate the shape of a ZIP code as a disk. Thus, we need the size distribution of the ZIP code areas.

The average distance between two randomly selected points from a disk is approximately the radius of the disk itself.

Average distance between two randomly selected points from a disk is approximately the radius of the disk itself
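
This claim is easy to check numerically. The known closed form for the mean distance between two uniformly random points in a disk of radius R is 128R/(45π), or about 0.905R. The Python sketch below compares it with a Monte Carlo estimate; the function names are illustrative only.

```python
import math
import random


def exact_mean_distance(radius=1.0):
    """Closed form for the mean distance between two uniform random points in a disk."""
    return 128.0 * radius / (45.0 * math.pi)


def random_point_in_disk(radius, rng):
    """Uniform point in a disk; the square root makes the density uniform in area."""
    r = radius * math.sqrt(rng.random())
    theta = 2.0 * math.pi * rng.random()
    return (r * math.cos(theta), r * math.sin(theta))


def monte_carlo_mean_distance(radius=1.0, samples=100_000, seed=0):
    """Estimate the mean distance by sampling pairs of points."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        total += math.dist(random_point_in_disk(radius, rng),
                           random_point_in_disk(radius, rng))
    return total / samples


print(exact_mean_distance(), monte_carlo_mean_distance(samples=20_000))
```

Since 0.905R is within 10% of R, using the disk radius itself is a reasonable simplification at this level of modeling.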

Within our crude model, we take the areas of the cities and calculate the radius of the corresponding disk. We could do a much more refined Monte Carlo model using the actual polygons of the ZIP code regions, but for the qualitative results that we are interested in, this would be overkill.

Calculate areas of cities and radius of corresponding disk

Now, with a more refined treatment of the same ZIP code data, on average, for a US citizen in the lower 48 states, the nearest brewery is still only about 13.5 miles away.

Nearest brewery about 13.5 miles away on average
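
A population-weighted average of this kind could be computed as in the following Python sketch. The details are assumptions, not the post's actual code: for a ZIP code that contains a brewery, the distance is replaced by the radius of a disk with the same area, and the rows are hypothetical.

```python
import math


def disk_radius(area):
    """Radius of a disk with the given area."""
    return math.sqrt(area / math.pi)


def weighted_mean_distance(rows):
    """rows: (adults, miles_to_nearest_brewery, area_sq_miles, has_local_brewery)."""
    total_people = 0.0
    total_miles = 0.0
    for adults, miles, area, has_local_brewery in rows:
        if has_local_brewery:
            # Crude model: typical in-ZIP distance is about one disk radius.
            miles = disk_radius(area)
        total_people += adults
        total_miles += adults * miles
    return total_miles / total_people


# Hypothetical ZIP code rows, for illustration only.
rows = [
    (1000, 0.0, 4 * math.pi, True),  # brewery in the same ZIP code, disk radius 2 miles
    (3000, 10.0, 50.0, False),       # nearest brewery 10 miles away
]
print(weighted_mean_distance(rows))
```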

And, modulo a scale factor, the distribution of distances to the nearest brewery is the same as the distribution above.

Same distribution as above

Let’s redo the same calculation for the distilleries.

Same calculation for distilleries

The weighted average distance to the nearest distillery is about 30 miles for the above-drinking-age customers of the lower 48 states.

Weighted average distance to the nearest distillery is about 30 miles for the above-drinking-age customers of the lower 48 states

And for about 1 in 7 Americans the nearest distillery is closer than the nearest brewery.

~16% of Americans live closer to distillery than brewery

We define a function that, for a given geographic position, calculates the distance to the nearest brewery and the nearest distillery.

Calculate the distance to nearest brewery and nearest distillery

E.g. if you are at Mt. Rushmore, the nearest brewery is just 18 miles away, while the nearest distillery is nearly 160 miles away.

Mt. Rushmore example

For some of the visualizations below, we find the distance to the nearest brewery and the nearest distillery for a dense grid of points in the US. It will take 20 minutes to calculate these 320,000 distances, so we have time to visit the nearest espresso machine in the meantime.

Find distance to nearest brewery and nearest distillery

So, how far away can the nearest brewery be from an adult US citizen (within the lower 48 states)? We calculate the maximal distance to a brewery.

Calculate the maximal distance to a brewery

We find that the city furthest away from a freshly brewed beer is Ely in Nevada, about 170 miles away.

Furthest away from freshly brewed beer is Ely

And here is the maximal distance to a distillery. From Redford, Texas it is about 335 miles to the nearest distillery.

Maximal distance to a distillery

Of the inhabitants of these two cities, the people from Ely have “only” a 188-mile distance to a distillery and the people from Redford are 54 miles from the next brewery.

Ely vs. Redford

Having found the cities with extremal distances, the next natural question is which city has the maximal distance to either a brewery or a distillery.

Maximal distance to brewery or distillery

Let’s have a look at the situation in the middle of Kansas. The ~100 adult citizens of Manter, Kansas are quite far away from a local alcoholic drink.

Alcohol situation in Manter, Kansas

And here is a detailed look at the breweries/distilleries situation near Manter.

Breweries/distilleries situation near Manter

Now that we have the detailed distances for a dense grid of points over the continental US, let’s visualize this data. First, we make plots showing the distance, where blue indicates small distances and red dangerously large distances.

Visualizing alcohol data

Projecting these distance plots properly onto a map of the US yields a more natural-looking image.

Natural-looking image of distance plots

And here is the corresponding image for distilleries. Note the clearly visible great Distillery Ridge mountain range between Eastern US distilleries and Western US distilleries.

Corresponding image for distilleries

For completeness, here is the maximum of either the distance to the nearest brewery or the distance to the nearest distillery.

Maximum of either the distance to the nearest brewery or the distance to the nearest distillery

And here is the equivalent 3D image with the distance to the nearest brewery or distillery shown as vertical elevation. We also use a typical elevation plot coloring scheme for this graphic.

Distance to the next brewery or distillery shown as vertical elevation

We can also zoom into the Big Dry Badlands mountain range to the East of Denver as an equal-distance-to-freshly-made-alcoholic-drink contour plot. The regions with a distance larger than 100 miles to the nearest brewery or distillery are emphasized with a purple background.

Zoom into the Big Dry Badlands mountain range to the East of Denver as an equal-distance-to-freshly-made-alcoholic-drink contour plot

Or, more explicit graphically, we can use the beer and whiskey images from earlier to show the regions that are closer to a brewery than to a distillery and vice versa. In the first image, the grayed-out regions are the ones where the nearest distillery is at a smaller distance than the nearest brewery. The second image shows regions where the nearest brewery is at a smaller distance than the nearest distillery in gray.

Use the beer and whiskey images from earlier to show the regions that are closer to a brewery than to a distillery

There are many more bells and whistles that we could add to these types of graphics. For instance, we could add some interactive elements to the above graphic that show details when hovering over the graphic.

Add interactive elements

Earlier in this blog post, we constructed an infographic about beer production and consumption in the US over the last few decades. After having analyzed distillery locations, a natural question is what role whiskey plays among all spirits. This paper analyzes the average alcohol content of spirits consumed in the US over a 50+ year time span at the level of US states. If you have a subscription, you can easily import the main findings of the study, which is Table 1.

Imported findings from study

Here is a snippet of the data. The average alcohol content of the spirits consumed decreased substantially from 1950 to 2000, mainly due to a decrease in whiskey consumption.

Here is a graphical representation of the data from 1950 to 2000.

Graphical representation of the data from 1950 to 2000

So far we have concentrated on beer- and whiskey-related issues on a geographic scale. Let’s finish with some stats and infographics on the kinds of beer produced in the breweries mapped above. Again, after some web searching, we find a page that lists the many types of beer, 160+ different styles to be precise. (See also the Handbook of Brewing and the “Brewers Association 2014 Beer Style Guidelines” for a detailed discussion of beer styles.)

Stats and infographics on kinds of beer produced

We again import the data. The web page is perfectly maintained and checked, so this time we do not have to carry out any data cleanup.

Importing data

How much beer one can drink depends on the alcohol content. Here is the distribution of beer styles by alcohol content. Hover over the graph to see the beer styles in the individual bins.

Distribution of beer styles by alcohol content
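The binning behind such a histogram can be sketched in a few lines of Python (the post itself uses Wolfram Language). This assumes each style has been reduced to a single representative alcohol value, for example the midpoint of its range; the function name and the sample styles are hypothetical.

```python
from collections import defaultdict


def bin_styles_by_abv(styles, bin_width=1.0):
    """Group style names into ABV bins; keys are the lower bin edges in percent."""
    bins = defaultdict(list)
    for name, abv in styles:
        lower_edge = (abv // bin_width) * bin_width
        bins[lower_edge].append(name)
    return dict(sorted(bins.items()))


# Illustrative values only, not taken from the imported style list.
sample = [("Style A", 4.5), ("Style B", 4.9), ("Style C", 7.2)]
for edge, names in bin_styles_by_abv(sample).items():
    print(edge, names)
```

Keeping the names in each bin, rather than only a count, is what makes a hover-to-reveal display possible.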

Beer colors are defined on a special scale called Standard Reference Method (SRM). Here is a translation of the SRM values to RGB colors.

Translation of the SRM values to RGB colors

How do beer colors correlate with alcohol content and bitterness? The following graphic shows the parameter ranges for the 160+ beer styles. Again, hover over the graph to see the beer style categories highlighted.

Parameter ranges for the over 160 beer styles

In an interactive 3D version, we can easily restrict the color values.

3D version

After visualizing breweries in the US and analyzing the alcohol content of beer types, what about the distribution of the actual brewed beers within the US? After doing some web searching again, we can find a website that lists breweries and the beers they brew.

So, let’s read in the beer data from the site for 2,600 breweries. We start with preparing a list of the relevant web pages.

Preparing a list of the relevant web pages

Next, we prepare for processing the individual pages.

Prepare for processing the individual pages

As this will take a while, we can display the breweries, their beers, and a link to the brewery website to entertain us in the meantime. Here is an example of what to display while waiting.

Breweries, beers, and a link to the brewery website

Now we process the data for all breweries. Time for another cup of coffee. To have some entertainment while processing the beers of 2,000+ breweries, we again use Monitor to display the last-analyzed brewery and their beers. We also show a clickable link to the brewery website so that the reader can choose a beer of their liking.

Process data for breweries

Here is a typical data entry. We have the brewery name, its location, and, if available, the actual beers, their classification as Lager, Bock, Doppelbock, Stout, etc., together with their alcohol content.

Typical data entry

Here is the distribution of the number of different beers made by the breweries. To get a feeling, we will quickly import some example images.

Distribution of the number of different beers made by the breweries

Concretely, more than 24,400 US-made beers were listed in the just-imported web pages.

24,400 US-made beers were listed in the just-imported web pages

Accumulating all beers gives the following cumulative distribution of the alcohol content.

Accumulating all beers gives cumulative distribution of the alcohol content

On average, a US beer has an alcohol content (by volume) of (6.7 ± 2.1)%.

US beer alcohol content average of 6.7%
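As a hedged Python sketch (not the post's Wolfram Language code), the cumulative distribution and a "mean ± spread" summary can be computed as follows. It assumes the spread is the sample standard deviation, which the post does not state explicitly; the function names and sample values are hypothetical.

```python
import statistics


def empirical_cdf(values):
    """Return (value, fraction of values <= value) pairs in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


def mean_and_spread(values):
    """Mean and sample standard deviation (an assumed reading of the ± value)."""
    return statistics.mean(values), statistics.stdev(values)


# Illustrative ABV values in percent, not the imported beer data.
abv = [4.0, 6.0, 8.0]
mean, spread = mean_and_spread(abv)
print(f"({mean:.1f} ± {spread:.1f})%")
print(empirical_cdf(abv))
```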

If we tally up by beer type, we get the following distribution of types. India Pale Ale is the winner, followed by American Pale Ale.

Distribution of beer types
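The tally itself is a one-liner in most languages. Here is a Python sketch using `collections.Counter`, assuming the data has been flattened to (beer name, beer type) pairs; that layout and the sample rows are assumptions, not the post's actual data structure.

```python
from collections import Counter


def tally_types(beers):
    """Count beers per type, most common first."""
    return Counter(kind for _name, kind in beers).most_common()


# Made-up rows for illustration.
beers = [("Beer 1", "India Pale Ale"), ("Beer 2", "India Pale Ale"), ("Beer 3", "Stout")]
print(tally_types(beers))
```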

Now let’s put the places where a Hefeweizen is freshly brewed on a map.

Where to find freshest Hefeweizen brew

And here are some healthy breakfast beers with oatmeal or coffee (in the name).

Breakfast beers
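Selecting beers by words in their names is a simple case-insensitive substring filter. A Python sketch follows, with a hypothetical function name and made-up beer names; the post's own selection is done in Wolfram Language.

```python
def beers_matching(names, keywords):
    """Return the names containing any keyword, ignoring case."""
    lowered = [k.lower() for k in keywords]
    return [n for n in names if any(k in n.lower() for k in lowered)]


# Invented names for illustration.
names = ["Morning Oatmeal Stout", "Hoppy Trail IPA", "Dark Coffee Porter"]
print(beers_matching(names, ["oatmeal", "coffee"]))
```

The same filter, called with a list of animal names, would produce the kind of list discussed next.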

For the carnivorous beer drinkers, there are plenty of options. Here are samples of beers with various mammals and fish in their name. (Using Select[#&@@@Flatten[Last/@Take[brewerBeerDataUS,All],2],
, we could get a complete list of all animal beers.)

Beers with various mammals and fish in their name

What about the names of the individual beers? Here is the distribution of their (string) lengths. Hover over the columns to see the actual names.

Beer name string lengths

Suppose you plan a day trip of up to 125 miles in radius (meaning not longer than about a two-hour drive in each direction). How many different beers and beer types would you encounter as a function of your starting location? Building a fast lookup for the breweries up to distance d, you can calculate these numbers for a dense set of points across the US and visualize the resulting data geographically. (For simplicity, we assume a spherical Earth for this calculation.)

Calculate for a dense set of points across US
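The counting step can be sketched in Python as below. This is a brute-force version under the same spherical-Earth assumption; it does not implement the fast lookup mentioned above, and the record layout `(lat, lon, [(beer_name, beer_type), ...])` is hypothetical.

```python
import math


def haversine_miles(p, q, radius=3958.8):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def reachable_counts(start, breweries, radius_miles=125.0):
    """Return (distinct beers, distinct beer types) brewed within the radius."""
    beers, types = set(), set()
    for lat, lon, brewed in breweries:
        if haversine_miles(start, (lat, lon)) <= radius_miles:
            for name, kind in brewed:
                beers.add(name)
                types.add(kind)
    return len(beers), len(types)


# Two invented breweries on the equator, about 690 miles apart.
breweries = [
    (0.0, 0.0, [("Beer A", "IPA"), ("Beer B", "Stout")]),
    (0.0, 10.0, [("Beer C", "IPA")]),
]
print(reachable_counts((0.0, 0.5), breweries))
```

Evaluating this for every grid point costs one pass over all breweries per point, which is why a spatial lookup pays off for a dense grid.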

In the best-case scenario, you can try about 80 different beer types realized through more than 2,000 different individual beers within a 125-mile radius.

80 different beer types realized through more than 2,000 different individual beers within a 125-mile radius

After so much work doing statistics on breweries, beer colors, beer names, etc., let’s have some real fun and make some visualizations using the beers and logos of breweries.

Many of the brewery homepages show images of the beers that they make. Let’s import some of these and make a delicious beer (bottle, can, glass) collage.

Beer bottle collage

We continue by making a reduced version of brewerBeerDataUS that contains the breweries and URLs by state.


Fortunately, many of the brewery websites have their logo on the front page, and in many cases the image has “logo” in its filename. This means a possible automated way to get images of logos is to read in the front page of each brewery’s website.

Find logo via web presence of brewery

We will restrict our logo searches to logos that are not too wide or too tall, because we want to use them inside graphics.

Restrict logo search
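One way to express "not too wide or too tall" is a bound on the aspect ratio. The Python sketch below is only an illustration: the function name and the threshold of 3 are assumptions, since the post's actual limits are not visible in the text.

```python
def usable_logo(width, height, max_aspect=3.0):
    """True if the longer side is at most `max_aspect` times the shorter side."""
    if width <= 0 or height <= 0:
        return False
    return max(width, height) / min(width, height) <= max_aspect


print(usable_logo(300, 200))   # moderately wide image
print(usable_logo(1200, 100))  # banner-shaped image
```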

We also define a small list of special-case lookups, especially for states that have only a few breweries.

Define a small list of special-case lookups

Now we are ready to carry out an automated search for brewery logos. To get some variety into the visualizations, we try to get about six different logos per state.

Automated search for brewery logos

After removing duplicates (from breweries that brew in more than one state), we have about 240 images at hand.

247 images

A simple collage of brewery logos does not look too interesting.

Simple brewery collage

So instead, let’s make some random and also symmetrized kaleidoscopic images of brewery logos. To do so, we will map the brewery logos into the polygons of a radial-symmetric arrangement of polygons. The function kaleidoscopePolygons generates such sets of polygons.

Random and symmetrized kaleidoscopic images of brewery logos
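The definition of kaleidoscopePolygons is not reproduced in the text, so the following Python sketch is not that function. It only illustrates the underlying symmetry idea: take one polygon and produce its n rotated copies about the origin, which gives an arrangement with n-fold rotational symmetry. All names are hypothetical.

```python
import math


def rotate(point, angle):
    """Rotate a 2D point counterclockwise about the origin by `angle` radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - y * s, x * s + y * c)


def symmetric_copies(polygon, order):
    """Return `order` copies of `polygon`, rotated by multiples of 2*pi/order."""
    return [[rotate(v, 2 * math.pi * k / order) for v in polygon]
            for k in range(order)]


# A thin wedge; three copies give threefold symmetry, four give fourfold.
wedge = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.5)]
for copy in symmetric_copies(wedge, 3):
    print([(round(x, 3), round(y, 3)) for x, y in copy])
```

Texturing each polygon with a logo image is then a separate rendering step.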

The next result shows two example sets of polygons with threefold and fourfold symmetry.

Two example sets of polygons with threefold and fourfold symmetry

And here are two random beer logo kaleidoscopes.

Two random beer logo kaleidoscopes

Here are four symmetric beer logo kaleidoscopes of different rotational symmetry orders.

Four symmetric beer logo kaleidoscopes of different rotational symmetry orders

Or we could add brewery stickers onto the faces of the Wolfram|Alpha Spikey, the rhombic hexecontahedron. As the faces of a rhombic hexecontahedron are quadrilaterals, the images don’t have to be distorted very much.

Add brewery stickers onto the faces of the Wolfram|Alpha spikey

Let’s end with randomly selecting a brewery logo for each state and mapping it onto the polygons of the state.

Randomly selecting a brewery logo for each state and mapping it onto the polygons of the state

The next graphic shows some randomly selected logos from states in the Northeast.

Randomly selected logos from Northeast states

And we finish with a brewery logo mapped onto each state of the continental US.

Brewery logo mapped onto each state of the continental US

We will now end and leave the analysis of wineries for a future blog post. For a more detailed account of the distribution of breweries throughout the US over the last few hundred years, and a variety of other beer-related geographical topics, I recommend reading the recent book The Geography of Beer, especially the chapter “Mapping United States Breweries 1612 to 2011”. For deciding if a bottle of beer, a glass of wine, or a shot of whiskey is right for you, follow this flowchart.

Download this post as a Computable Document Format (CDF) file.

Stephen Wolfram http:// <![CDATA[Computational Knowledge and the Future of Pure Mathematics]]> http://blog.internal.wolframalpha.com/?p=28416 2014-08-14T18:08:48Z 2014-08-12T15:24:17Z

Every four years for more than a century there’s been an International Congress of Mathematicians (ICM) held somewhere in the world. In 1900 it was where David Hilbert announced his famous collection of math problems, and it’s remained the top single periodic gathering for the world’s research mathematicians.

This year the ICM is in Seoul, and I’m going to it today. I went to the ICM once before—in Kyoto in 1990. Mathematica was only two years old then, and mathematicians were just getting used to it. Plenty already used it extensively—but at the ICM there were also quite a few who said, “I do pure mathematics. How can Mathematica possibly help me?”

pure mathematics putting the pieces together

Twenty-four years later, the vast majority of the world’s pure mathematicians do in fact use Mathematica in one way or another. But there’s nevertheless a substantial core of pure mathematics that still gets done pretty much the same way it’s been done for centuries—by hand, on paper.

Ever since the 1990 ICM I’ve been wondering how one could successfully inject technology into this. And I’m excited to say that I think I’ve recently begun to figure it out. There are plenty of details that I don’t yet know. And to make what I’m imagining real will require the support and involvement of a substantial set of the world’s pure mathematicians. But if it’s done, I think the results will be spectacular—and will surely change the face of pure mathematics at least as much as Mathematica (and for a younger generation, Wolfram|Alpha) have changed the face of calculational mathematics, and potentially usher in a new golden age for pure mathematics.

Workflow of pure math

The whole story is quite complicated. But for me one important starting point is the difference in the typical workflows for calculational mathematics and pure mathematics. Calculational mathematics tends to involve setting up calculational questions, and then working through them to get results, just like in typical interactive Mathematica sessions. But pure mathematics tends to involve taking mathematical objects, results or structures, coming up with statements about them, and then giving proofs to show why those statements are true.

How can we usefully insert technology into this workflow? Here’s one simple way. Think about Wolfram|Alpha. If you enter 2+2, Wolfram|Alpha—like Mathematica—will compute 4. But if you enter new york—or, for that matter, 2.3363636 or cos(x) log(x)—there’s no single “answer” for it to compute. And instead what it does is to generate a report that gives you a whole sequence of “interesting facts” about what you entered.

partial wolfram alpha output for cos x log x

And this kind of thing fits right into the workflow for pure mathematics. You enter some mathematical object, result or structure, and then the system tries to tell you interesting things about it—just like some extremely wise mathematical colleague might. You can guide the system if you want to, by telling it what kinds of things you want to know about, or even by giving it a candidate statement that might be true. But the workflow is always the Wolfram|Alpha-like “what can you tell me about that?” rather than the Mathematica-like “what’s the answer to that?”

Wolfram|Alpha already does quite a lot of this kind of thing with mathematical objects. Enter a number, or a mathematical expression, or a graph, or a probability distribution, or whatever, and Wolfram|Alpha will use often-quite-sophisticated methods to try to tell you a collection of interesting things about it.

wolfram alpha output collection of interesting things

But to really be useful in pure mathematics, there’s something else that’s needed. In addition to being able to deal with concrete mathematical objects, one also has to be able to deal with abstract mathematical structures.

Countless pure mathematical papers start with things like, “Let F be a field with such-and-such properties.” We need to be able to enter something like this—then have our system automatically give us interesting facts and theorems about F, in effect creating a whole automatically generated paper that tells the story of F.

So what would be involved in creating a system to do this? Is it even possible? There are several different components, all quite difficult and time consuming to build. But based on my experiences with Mathematica, Wolfram|Alpha, and A New Kind of Science, I am quite certain that with the right leadership and enough effort, all of them can in fact be built.

A key part is to have a precise symbolic description of mathematical concepts and constructs. Lots of this now already exists—after more than a quarter century of work—in Mathematica. Because built right into the Wolfram Language are very general ways to represent geometries, or equations, or stochastic processes or quantifiers. But what’s not built in are representations of pure mathematical concepts like bijections or abstract semigroups or pullbacks.

Mathematica Pura

Over the years, plenty of mathematicians have implemented specific cases. But could we systematically extend the Wolfram Language to cover the whole range of pure mathematics, and make a kind of “Mathematica Pura”? The answer is unquestionably yes. It’ll be fascinating to do, but it’ll take lots of difficult language design.

I’ve been doing language design now for 35 years—and it’s the hardest intellectual activity I know. It requires a curious mixture of clear thinking, aesthetics and pragmatic judgement. And it involves always seeking the deepest possible understanding, and trying to do the broadest unification—to come up in the end with the cleanest and “most obvious” primitives to represent things.

Today the main way pure mathematics is described—say in papers—is through a mixture of mathematical notation and natural language, together with a few diagrams. And in designing a precise symbolic language for pure mathematics, this has to be the starting point.

One might think that somehow mathematical notation would already have solved the whole problem. But there’s actually only a quite small set of constructs and concepts that can be represented with any degree of standardization in mathematical notation—and indeed many of these are already in the Wolfram Language.

So how should one go further? The first step is to understand what the appropriate primitives are. The whole Wolfram Language today has about 5000 built-in functions—together with many millions of built-in standardized entities. My guess is that to broadly support pure mathematics there would need to be something less than a thousand other well-designed functions that in effect define frameworks—together with maybe a few tens of thousands of new entities or their analogs.

wolfram language function and entity categories

Take something like function spaces. Maybe there’ll be a FunctionSpace function to represent a function space. Then there’ll be various operations on function spaces, like PushForward or MetrizableQ. Then there’ll be lots of named function spaces, like “CInfinity”, with various kinds of parameterizations.

Underneath, everything’s just a symbolic expression. But in the Wolfram Language there end up being three immediate ways to input things, all of which are critical to having a convenient and readable language. The first is to use short notations—like + or forall—as in standard mathematical notation. The second is to use carefully chosen function names—like MatrixRank or Simplex. And the third is to use free-form natural language—like trefoil knot or aleph0.

One wants to have short notations for some of the most common structural or connective elements. But one needs the right number: not too few, like in LISP, nor too many, like in APL. Then one wants to have function names made of ordinary words, arranged so that if one’s given something written in the language one can effectively just “read the words” to know at least roughly what’s going on in it.

Computers & humans

But in the modern Wolfram Language world there’s also free-form natural language. And the crucial point is that by using this, one can leverage all the various convenient (but sloppy) notations that actual mathematicians use and find familiar. In the right context, one can enter “L2” for Lebesgue Square Integrable, and the natural language system will take care of disambiguating it and inserting the canonical symbolic underlying form.

Ultimately every named construct or concept in pure mathematics needs to have a place in our symbolic language. Most of the 13,000+ entries in MathWorld. Material from the 5600 or so entries in the MSC2010 classification scheme. All the many things that mathematicians in any given field would readily recognize when told their names.

But, OK, so let’s say we manage to create a precise symbolic language that captures the concepts and constructs of pure mathematics. What can we do with it?

One thing is to use it “Wolfram|Alpha style”: you give free-form input, which is then interpreted into the language, and then computations are done, and a report is generated.

But there’s something else too. If we have a sufficiently well-designed symbolic language, it’ll be useful not only to computers but also to humans. In fact, if it’s good enough, people should prefer to write out their math in this language than in their current standard mixture of natural language and mathematical notation.

When I write programs in the Wolfram Language, I pretty much think directly in the language. I’m not coming up with a description in English of what I’m trying to do and then translating it into the Wolfram Language. I’m forming my thoughts from the beginning in the Wolfram Language—and making use of its structure to help me define those thoughts.

If we can develop a sufficiently good symbolic language for pure mathematics, then it’ll provide something for pure mathematicians to think in too. And the great thing is that if you can describe what you’re thinking in a precise symbolic language, there’s never any ambiguity about what anything means: there’s a precise definition that you can just go to the documentation for the language to find.

And once pure math is represented in a precise symbolic language, it becomes in effect something on which computation can be done. Proofs can be generated or checked. Searches for theorems can be done. Connections can automatically be made. Chains of prerequisites can automatically be found.

But, OK, so let’s say we have the raw computational substrate we need for pure mathematics. How can we use this to actually implement a Wolfram|Alpha-like workflow where we enter descriptions of things, and then in effect automatically get mathematical wisdom about them?

There are two seemingly different directions one can go. The first is to imagine abstractly enumerating possible theorems about what has been entered, and then using heuristics to decide which of them are interesting. The second is to start from computable versions of the millions of theorems that have actually been published in the literature of mathematics, and then figure out how to connect these to whatever has been entered.

Each of these directions in effect reflects a slightly different view of what doing mathematics is about. And there’s quite a bit to say about each direction.

Math by enumeration

Let’s start with theorem enumeration. In the simplest case, one can imagine starting from an axiom system and then just enumerating true theorems based on that system. There are two basic ways to do this. The first is to enumerate possible statements, and then to use (implicit or explicit) theorem-proving technology to try to determine which of them are true. And the second is to enumerate possible proofs, in effect treeing out possible ways the axioms can be applied to get theorems.

It’s easy to do either of these things for something like Boolean algebra. And the result is that one gets a sequence of true theorems. But if a human looks at them, many of them seem trivial or uninteresting. So then the question is how to know which of the possible theorems should actually be considered “interesting enough” to be included in a report that’s generated.
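To make the first approach concrete, here is a small Python sketch (not code from the post or from A New Kind of Science). It enumerates expressions built from two variables and a single Nand operator, written ∘, and reports every pair of distinct expressions with identical truth tables as a true equation. Using truth tables in place of a theorem prover is a shortcut that only works for something as simple as Boolean algebra; the choice of two variables and all function names are assumptions made for illustration.

```python
from itertools import product

VARIABLES = ("p", "q")


def expressions(size):
    """All Nand expressions with exactly `size` operators, as nested tuples."""
    if size == 0:
        return list(VARIABLES)
    result = []
    for left_size in range(size):
        for left in expressions(left_size):
            for right in expressions(size - 1 - left_size):
                result.append((left, right))
    return result


def evaluate(expr, env):
    """Evaluate an expression; a tuple (a, b) means a Nand b."""
    if isinstance(expr, str):
        return env[expr]
    left, right = expr
    return not (evaluate(left, env) and evaluate(right, env))


def truth_table(expr):
    """Values of `expr` over every assignment of the variables."""
    return tuple(evaluate(expr, dict(zip(VARIABLES, values)))
                 for values in product((False, True), repeat=len(VARIABLES)))


def show(expr):
    """Render an expression as a string with ∘ for Nand."""
    return expr if isinstance(expr, str) else f"({show(expr[0])}∘{show(expr[1])})"


def theorems(max_size):
    """Equations between distinct expressions with equal truth tables."""
    exprs = [e for n in range(max_size + 1) for e in expressions(n)]
    tables = [truth_table(e) for e in exprs]
    return [(show(a), show(b))
            for i, a in enumerate(exprs)
            for j, b in enumerate(exprs)
            if i < j and tables[i] == tables[j]]


for lhs, rhs in theorems(1):
    print(lhs, "==", rhs)
```

Even at this tiny size the output mixes a recognizable statement (commutativity of Nand) with equations that look like trivial variations, which is exactly the ranking problem discussed next.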

My first assumption was that there would be no automatic approach to this—and that “interestingness” would inevitably depend on the historical development of the relevant area of mathematics. But when I was working on A New Kind of Science, I did a simple experiment for the case of Boolean algebra.

NKS Boolean algebra theorems p 817

There are 14 theorems of Boolean algebra that are usually considered “interesting enough” to be given names in textbooks. I took all possible theorems and listed them in order of complexity (number of variables, number of operators, etc). And the surprising thing I found is that the set of named theorems corresponds almost exactly to the set of theorems that can’t be proved just from ones that precede them in the list. In other words, the theorems which have been given names are in a sense exactly the minimal statements of new information about Boolean algebra.

Boolean algebra is of course a very simple case. And in the kind of enumeration I just described, once one’s got the theorems corresponding to all the axioms, one would conclude that there aren’t any more “interesting theorems” to find, which for many mathematical theories would be quite silly. But I think this example is a good indication of how one can start to use automated heuristics to figure out which theorems are “worth reporting on”, and which are, for example, just “uninteresting embellishments”.

Interestingness

Of course, the general problem of ranking “what’s interesting” comes up all over Wolfram|Alpha. In mathematical examples, one’s asking “what region is interesting to plot?”, “what alternate forms are interesting?” and so on. When one enters a single number, one’s also asking “what closed forms are interesting enough to show?”, and to know this, one for example has to invent rankings for all sorts of mathematical objects (how complicated should one consider pi relative to log(343) relative to Khinchin’s Constant, and so on?).

possible closed forms for 137.036 in wolfram alpha

So in principle one can imagine having a system that takes input and generates “interesting” theorems about it. Notice that while in a standard Mathematica-like calculational workflow, one would be taking input and “computing an answer” from it, here one’s just “finding interesting things to say about it”.

The character of the input is different too. In the calculational case, one’s typically dealing with an operation to be performed. In the Wolfram|Alpha-like pure mathematical case, one’s typically just giving a description of something. In some cases that description will be explicit. A specific number. A particular equation. A specific graph. But more often it will be implicit. It will be a set of constraints. One will say (to use the example from above), “Let F be a field,” and then one will give constraints that the field must satisfy.

In a sense an axiom system is a way of giving constraints too: it doesn’t say that such-and-such an operator “is Nand”; it just says that the operator must satisfy certain constraints. And even for something like standard Peano arithmetic, we know from Gödel’s Theorem that we can never ultimately resolve the constraints: we can never nail down that the thing we denote by “+” in the axioms is the particular operation of ordinary integer addition. Of course, we can still prove plenty of theorems about “+”, and those are what we choose from for our report.

So given a particular input, we can imagine representing it as a set of constraints in our precise symbolic language. Then we would generate theorems based on these constraints, and heuristically pick the “most interesting” of them.

One day I’m sure doing this will be an important part of pure mathematical work. But as of now it will seem quite alien to most pure mathematicians—because they are not used to “disembodied theorems”; they are used to theorems that occur in papers, written by actual mathematicians.

And this brings us to the second approach to the automatic generation of “mathematical wisdom”: start from the historical corpus of actual mathematical papers, and then make connections to whatever specific input is given. So one is able to say for example, “The following theorem from paper X applies in such-and-such a way to the input you have given”, and so on.

Curating the math corpus

So how big is the historical corpus of mathematics? There’ve probably been about 3 million mathematical papers published altogether, or about 100 million pages, growing at a rate of about 2 million pages per year. And in all of these papers, perhaps 5 million distinct theorems have been formally stated.

So what can be done with these? First, of course, there’s simple search and retrieval. Often the words in the papers will make for better search targets than the more notational material in the actual theorems. But with the kind of linguistic-understanding technology for math that we have in Wolfram|Alpha, it should not be too difficult to build what’s needed to do good statistical retrieval on the corpus of mathematical papers.

But can one go further? One might think about tagging the source documents to improve retrieval. But my guess is that most kinds of static tagging won’t be worth the trouble; just as one’s seen for the web in general, it’ll be much easier and better to make the search system more sophisticated and content-aware than to add tags document by document.

What would unquestionably be worthwhile, however, is to put the theorems into a genuine computable form: to actually take theorems from papers and rewrite them in a precise symbolic language.

Will it be possible to do this automatically? Eventually I suspect large parts of it will. Today we can take small fragments of theorems from papers and use the linguistic understanding system built for Wolfram|Alpha to turn them into pieces of Wolfram Language code. But it should gradually be possible to extend this to larger fragments—and eventually get to the point where it takes, at most, modest human effort to convert a typical theorem to precise symbolic form.

So let’s imagine we curate all the theorems from the literature of mathematics, and get them in computable form. What would we do then? We could certainly build a Wolfram|Alpha-like system that would be quite spectacular—and very useful in practice for doing lots of pure mathematics.

Undecidability bites

But there will inevitably be some limitations, resulting in fact from features of mathematics itself. For example, it won’t necessarily be easy to tell what theorem might apply to what, or even what theorems might be equivalent. Ultimately these are classic theoretically undecidable problems, and I suspect that they will often actually be difficult in practical cases too. And at the very least, all of them involve the same kind of basic process as automated theorem proving.

And what this suggests is a kind of combination of the two basic approaches we’ve discussed—where in effect one takes the complete corpus of published mathematics, and views it as defining a giant 5-million-axiom formal system, and then follows the kind of automated theorem-enumeration procedure we discussed to find “interesting things to say”.

Math: science or art?

So, OK, let’s say we build a wonderful system along these lines. Is it actually solving a core problem in doing pure mathematics, or is it missing the point?

I think it depends on what one sees the nature of the pure mathematical enterprise as being. Is it science, or is it art? If it’s science, then being able to make more theorems faster is surely good. But if it’s art, that’s really not the point. If doing pure mathematics is like creating a painting, automation is going to be largely counterproductive—because the core of the activity is in a sense a form of human expression.

This is not unrelated to the role of proof. To some mathematicians, what matters is just the theorem: knowing what’s true. The proof is essentially backup to ensure one isn’t making a mistake. But to other mathematicians, proof is a core part of the content of the mathematics. For them, it’s the story that brings mathematical concepts to light, and communicates them.

So what happens when we generate a proof automatically? I had an interesting example about 15 years ago, when I was working on A New Kind of Science, and ended up finding the simplest axiom system for Boolean algebra (just the single axiom ((p∘q)∘r)∘(p∘((p∘r)∘p))==r, as it turned out). I used equational-logic automated theorem-proving (now built into FullSimplify) to prove the correctness of the axiom system. And I printed the proof that I generated in the book:

NKS proof of Boolean axiom system pp 811 812

It has 343 steps, and in ordinary-size type would be perhaps 40 pages long. And to me as a human, it’s completely incomprehensible. One might have thought it would help that the theorem prover broke the proof into 81 lemmas. But try as I might, I couldn’t really find a way to turn this automated proof into something I or other people could understand. It’s nice that the proof exists, but the actual proof itself doesn’t tell me anything.

Proof as story

And the problem, I think, is that there’s no “conceptual story” around the elements of the proof. Even if the lemmas are chosen “structurally” as good “waypoints” in the proof, there are no cognitive connections—and no history—around these lemmas. They’re just disembodied, and apparently disconnected, facts.

So how can we do better? If we generate lots of similar proofs, then maybe we’ll start seeing similar lemmas a lot, and through being familiar they will seem more meaningful and comprehensible. And there are probably some visualizations that could help us quickly get a good picture of the overall structure of the proof. And of course, if we manage to curate all known theorems in the mathematics literature, then we can potentially connect automatically generated lemmas to those theorems.

It’s not immediately clear how often that will be possible—and indeed in existing examples of computer-assisted proofs, like for the Four Color Theorem, the Kepler Conjecture, or the simplest universal Turing machine, my impression is that the often-computer-generated lemmas that appear rarely correspond to known theorems from the literature.

But despite all this, I know at least one example showing that with enough effort, one can generate proofs that tell stories that people can understand: the step-by-step solutions system in Wolfram|Alpha Pro. Millions of times a day students and others compute things like integrals with Wolfram|Alpha—then ask to see the steps.

wolfram alpha step by step indefinite integral

It’s notable that actually computing the integral is much easier than figuring out good steps to show; in fact, it takes some fairly elaborate algorithms and heuristics to generate steps that successfully communicate to a human how the integral can be done. But the example of step-by-step in Wolfram|Alpha suggests that it’s at least conceivable that with enough effort, it would be possible to generate proofs that are readable as “stories”—perhaps even selected to be as short and simple as possible (“proofs from The Book”, as Erdős would say).

Of course, while these kinds of automated methods may eventually be good at communicating the details of something like a proof, they won’t realistically be able to communicate—or even identify—overarching ideas and motivations. Needless to say, present-day pure mathematics papers are often quite deficient in communicating these too. Because in an effort to ensure rigor and precision, many papers tend to be written in a very formal way that cannot successfully represent the underlying ideas and motivations in the mind of the author—with the result that some of the most important ideas in mathematics are transmitted through an essentially oral tradition.

It would certainly help the progress of pure mathematics if there were better ways to communicate its content. And perhaps having a precise symbolic language for pure mathematics would make it easier to express concretely some of those important points that are currently left unwritten. But one thing is for sure: having such a language would make it possible to take a theorem from anywhere, and—like with a typical Wolfram Language code fragment—immediately be able to plug it in anywhere else, and use it.

But back to the question of whether automation in pure mathematics can ultimately make sense. I consider it fairly clear that a Wolfram|Alpha-like “pure math assistant” would be useful to human mathematicians. I also consider it fairly clear that having a good, precise, symbolic language—a kind of Mathematica Pura that’s a well-designed follow-on to standard mathematical notation—would be immensely helpful in formulating, checking and communicating math.

Automated discovery

But what about a computer just “going off and doing math by itself”? Obviously the computer can enumerate theorems, and even use heuristics to select ones that might be considered interesting to human mathematicians. And if we curate the literature of mathematics, we can do extensive “empirical metamathematics” and start trying to recognize theorems with particular characteristics, perhaps by applying graph-theoretic criteria on the network of theorems to see what counts as “surprising” or a “powerful” theorem. There’s also nothing particularly difficult—like in WolframTones—about having the computer apply aesthetic criteria deduced from studying human choices.

But I think the real question is whether the computer can build up new conceptual frameworks and structures—in effect new mathematical theories. Certainly some theorems found by enumeration will be surprising and indicative of something fundamentally new. And it will surely be impressive when a computer can take a large collection of theorems—whether generated or from the literature—and discover correlations among them that indicate some new unifying principle. But I would expect that in time the computer will be able not only to identify new structures, but also name them, and start building stories about them. Of course, it is for humans to decide whether they care about where the computer is going, but the basic character of what it does will, I suspect, be largely indistinguishable from many forms of human pure mathematics.

All of this is still fairly far in the future, but there’s already a great way to discover math-like things today—that’s not practiced nearly as much as it should be: experimental mathematics. The term has slightly different meanings to different people. For me it’s about going out and studying what mathematical systems do by running experiments on them. And so, for example, if we want to find out about some class of cellular automata, or nonlinear PDEs, or number sequences, or whatever, we just enumerate possible cases and then run them and see what they do.
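As a small illustration of "enumerate possible cases, run them, and see what they do", here is a minimal Python sketch (not code from the original post) that runs all 256 elementary cellular automaton rules from a single black cell. The function names, the grid width, and the summary statistic are assumptions chosen for brevity; a real experiment would look at the full patterns rather than one number per rule.

```python
def step(cells, rule):
    """Apply one step of an elementary cellular automaton (cyclic boundaries)."""
    n = len(cells)
    return [
        (rule >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def run(rule, width=31, steps=15):
    """Evolve a rule from a single black cell and return every row."""
    row = [0] * width
    row[width // 2] = 1
    rows = [row]
    for _ in range(steps):
        row = step(row, rule)
        rows.append(row)
    return rows


def summarize(rows):
    """Crude summary (an assumption): fraction of black cells in the last row."""
    return sum(rows[-1]) / len(rows[-1])


# Enumerate every elementary rule, run it, and record what it did.
results = {rule: summarize(run(rule)) for rule in range(256)}
```

Printing `run(30)` row by row, with `#` for 1 and `.` for 0, shows the irregular pattern of rule 30; sorting `results` by value is one quick way to spot rules that die out or fill the grid.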

There’s a lot to discover like this. And certainly it’s a rich way to generate observations and hypotheses that can be explored using the traditional methodologies of pure mathematics. But the real thrust of what can be done does not fit into what pure mathematicians typically think of as math. It’s about exploring the “flora and fauna”—and principles—of the universe of possible systems, not about building up math-like structures that can be studied and explained using theorems and proofs. Which is why—to quote the title of my book—I think one should best consider this a new kind of science, rather than something connected to existing mathematics.

In discussing experimental mathematics and A New Kind of Science, it’s worth mentioning that in some sense it’s surprising that pure mathematics is doable at all—because if one just starts asking absolutely arbitrary questions about mathematical systems, many of them will end up being undecidable.

This is particularly obvious when one’s out in the computational universe of possible programs, but it’s also true for programs that represent typical mathematical systems. So why isn’t undecidability more of a problem for typical pure mathematics? The answer is that pure mathematics implicitly tends to select what it studies so as to avoid undecidability. In a sense this seems to be a reflection of history: pure mathematics follows what it has historically been successful in doing, and in that way ends up navigating around undecidability—and producing the millions of theorems that make up the corpus of existing pure mathematics.

OK, so those are some issues and directions. But where are we at in practice in bringing computational knowledge to pure mathematics?

Getting it done

There’s certainly a long history of related efforts. The works of Peano and Whitehead and Russell from a century ago. Hilbert’s program. The development of set theory and category theory. And by the 1960s, the first computer systems—such as Automath—for representing proof structures. Then from the 1970s, systems like Mizar that attempted to provide practical computer frameworks for presenting proofs. And in recent times, increasingly popular “proof assistants” based on systems like Coq and HOL.

One feature of essentially all these efforts is that they were conceived as defining a kind of “low-level language” for mathematics. Like most of today’s computer languages, they include a modest number of primitives, then imagine that essentially any actual content must be built externally, by individual users or in libraries.

But the new idea in the Wolfram Language is to have a knowledge-based language, in which as much actual knowledge as possible is carefully designed into the language itself. And I think that just like in general computing, the idea of a knowledge-based language is going to be crucial for injecting computation into pure mathematics in the most effective and broadly useful way.

So what’s involved in creating our Mathematica Pura—an extension to the Wolfram Language that builds in the actual structure and content of pure math? At the lowest level, the Wolfram Language deals with arbitrary symbolic expressions, which can represent absolutely anything. But then the language uses these expressions for many specific purposes. For example, it can use a symbol x to represent an algebraic variable. And given this, it has many functions for handling symbolic expressions—interpreted as mathematical or algebraic expressions—and doing various forms of math with them.

The emphasis of the math in Mathematica and the Wolfram Language today is on practical, calculational, math. And by now it certainly covers essentially all the math that has survived from the 19th century and before. But what about more recent math? Historically, math itself went through a transition about a century ago. Just around the time modernism swept through areas like the arts, math had its own version: it started to consider systems that emerged purely from its own formalism, without regard for obvious connection to the outside world.

And this is the kind of math—through developments like Bourbaki and beyond—that came to dominate pure mathematics in the 20th century. And inevitably, a lot of this math is about defining abstract structures to study. In simple cases, it seems like one might represent these structures using some hierarchy of types. But the types need to be parametrized, and quite quickly one ends up with a whole algebra or calculus of types—and it’s just as well that in the Wolfram Language one can use general symbolic expressions, with arbitrary heads, rather than just simple type descriptions.

As I mentioned early in this blog post, it’s going to take all sorts of new built-in functions to capture the frameworks needed to represent modern pure mathematics—together with lots of entity-like objects. And it’ll certainly take years of careful design to make a broad system for pure mathematics that’s really clean and usable. But there’s nothing fundamentally difficult about having symbolic constructs that represent differentiability or moduli spaces or whatever. It’s just language design, like designing ways to represent 3D images or remote computation processes or unique external entity references.

So what about curating theorems from the literature? Through Wolfram|Alpha and the Wolfram Language, not to mention for example the Wolfram Functions Site and the Wolfram Connected Devices Project, we’ve now had plenty of experience at the process of curation, and in making potentially complex things computable.

The eCF example

But to get a concrete sense of what’s involved in curating mathematical theorems, we did a pilot project over the last couple of years through the Wolfram Foundation, supported by the Sloan Foundation. For this project we picked a very specific and well-defined area of mathematics: research on continued fractions. Continued fractions have been studied continually since antiquity, but were at their most popular between about 1780 and 1910. In all there are around 7000 books and papers about them, running to about 150,000 pages.

We chose about 2000 documents, then set about extracting theorems and other mathematical information from them. The result was about 600 theorems, 1500 basic formulas, and about 10,000 derived formulas. The formulas were directly in computable form—and were in effect immediately able to join the 300,000+ on the Wolfram Functions Site, that are all now included in Wolfram|Alpha. But with the theorems, our first step was just to treat them as entities themselves, with properties such as where they were first published, who discovered them, etc. And even at this level, we were able to insert some nice functionality into Wolfram|Alpha.

wolfram alpha output Worpitzky theorem

But we also started trying to actually encode the content of the theorems in computable form. It took introducing some new constructs like LebesgueMeasure, ConvergenceSet and LyapunovExponent. But there was no fundamental problem in creating precise symbolic representations of the theorems. And just from these representations, it became possible to do computations like this in Wolfram|Alpha:

continued fraction theorems in wolfram alpha continued fraction results involving quadratic irrationals stern stoltz theorem query in wolfram alpha

An interesting feature of the continued fraction project (dubbed “eCF”) was how the process of curation actually led to the discovery of some new mathematics. For having done curation on 50+ papers about the Rogers–Ramanujan continued fraction, it became clear that there were missing cases that could now be computed. And the result was the filling of a gap left by Ramanujan for 100 years.

Ramanujan missing cases now computable

There’s always a tradeoff between curating knowledge and creating it afresh. And so, for example, in the Wolfram Functions Site, there was a core of relations between functions that came from reference books and the literature. But it was vastly more efficient to generate other relations than to scour the literature to find them.

wolfram function site wolfram alpha generate relations between functions

But if the goal is curation, then what would it take to curate the complete literature of mathematics? In the eCF project, it took about 3 hours of mathematician time to encode each theorem in computable form. But all this work was done by hand, and in a larger-scale project, I am certain that an increasing fraction of it could be done automatically, not least using extensions of our Wolfram|Alpha natural language understanding system.

Of course, there are all sorts of practical issues. Newer papers are predominantly in TeX, so it’s not too difficult to pull out theorems with all their mathematical notation. But older papers need to be scanned, which requires math OCR, which has yet to be properly developed.

Then there are issues like whether theorems stated in papers are actually valid. And even whether theorems that were considered valid, say, 100 years ago are still considered valid today. For example, for continued fractions, there are lots of pre-1950 theorems that were successfully proved in their time, but which ignore branch cuts, and so wouldn’t be considered correct today.

And in the end of course it requires lots of actual, skilled mathematicians to guide the curation process, and to encode theorems. But in a sense this kind of mobilization of mathematicians is not completely unfamiliar; it’s something like what was needed when Zentralblatt was started in 1931, or Mathematical Reviews in 1941. (As a curious footnote, the founding editor of both these publications was Otto Neugebauer, who worked just down the hall from me at the Institute for Advanced Study in the early 1980s, but who I had no idea was involved in anything other than decoding Babylonian mathematics until I was doing research for this blog post.)

When it comes to actually constructing a system for encoding pure mathematics, there’s an interesting example: Theorema, started by Bruno Buchberger in 1995, and recently updated to version 2. Theorema is written in the Wolfram Language, and provides both a document-based environment for representing mathematical statements and proofs, and actual computation capabilities for automated theorem proving and so on.

theorema proof

No doubt it’ll be an element of what’s ultimately built. But the whole project is necessarily quite large—perhaps the world’s first example of “big math”. So can the project get done in the world today? A crucial part is that we now have the technical capability to design the language and build the infrastructure that’s needed. But beyond that, the project also needs a strong commitment from the world’s mathematics community—as well as lots of contributions from individual mathematicians from every possible field. And realistically it’s not a project that can be justified on commercial grounds—so the likely $100+ million that it will need will have to come from non-commercial sources.

But it’s a great and important project—that promises to be pivotal for pure mathematics. In almost every field there are golden ages when dramatic progress is made. And more often than not, such golden ages are initiated by new methodology and the arrival of new technology. And this is exactly what I think will happen in pure mathematics. If we can mobilize the effort to curate known mathematics and build the system to use and generate computational knowledge around it, then we will not only succeed in preserving and spreading the great heritage of pure mathematics, but we will also thrust pure mathematics into a period of dramatic growth.

Large projects like this rely on strong leadership. And I stand ready to do my part, and to contribute the core technology that is needed. Now to move this forward, what it takes is commitment from the worldwide mathematics community. We have the opportunity to make the second decade of the 21st century really count in the multi-millennium history of pure mathematics. Let’s actually make it happen!

To comment, please visit the original post at the Stephen Wolfram Blog »

Rosetta—First Mission to Orbit and Land on a Comet

By Jeff Bryant, published 2014-08-04 (updated 2014-08-06)

We recently posted a blog entry celebrating the anniversary of the Apollo 11 landing on the Moon. Now, just a couple weeks later, we are preparing for another first: the European Space Agency’s attempt to orbit and then land on a comet. The Rosetta spacecraft was launched in 2004 with the ultimate goal of orbiting and landing on comet 67P/Churyumov–Gerasimenko. Since the launch, Rosetta has already flown by asteroid Steins, in 2008, and asteroid 21 Lutetia, in 2010.

NASA and the European Space Agency (ESA) have a long history of sending probes to other solar system bodies that then orbit those bodies. The bodies have usually been nice, well-behaved, and spherical, making orbital calculations a fairly standard thing. But, as Rosetta recently started to approach comet 67P, we began to get our first views of this alien world. And it is far from spherical.

Far from spherical comet 67P


The comet’s nucleus can be roughly described as a flat platform with a spheroidal rock attached to one end, with the entire system rotating in a complex manner. How do you orbit such an object? The exact dynamics to model are beyond the scope of this post, but we can make use of Wolfram|Alpha and the Wolfram Language, including new features in Mathematica 10, to do some thought experiments. Can we generate a simple model of the comet, and then try to simulate the probe’s approach and orbit of this irregular body? Keep in mind that comet nuclei are relatively small bodies with small masses. This means that there’s not a lot of gravity to hold you in orbit or to help you land. Initial estimates suggest this comet has an escape velocity of about 0.5 m/s. If you move any faster than this, you will fly off into space and never come back, so the approach must be very slow, on the order of 10 cm/s, or the probe risks never being captured.
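To get a feel for how weak this gravity is, here is a quick Python sketch (an illustration, not part of the original notebook) of the standard escape-velocity formula for a spherical body, v = sqrt(2GM/r). The comet mass and radius below are placeholder assumptions, not values from this post; Earth is included only as a sanity check.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def escape_velocity(mass_kg, radius_m):
    """Escape velocity at the surface of a spherical body: sqrt(2 G M / r)."""
    return math.sqrt(2.0 * G * mass_kg / radius_m)


# Placeholder comet values, for illustration only (assumed, not from the post).
comet_mass = 1.0e13     # kg
comet_radius = 2000.0   # m, effective radius

v_comet = escape_velocity(comet_mass, comet_radius)  # m/s
v_earth = escape_velocity(5.972e24, 6.371e6)         # m/s, sanity check
```

With these placeholder numbers the comet's escape velocity comes out below 1 m/s, compared with roughly 11 km/s for Earth, which is why the approach has to be so slow.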

Wolfram|Alpha recently added support for physical systems that we can use to model our comet. First, we need the shape. Let’s model the comet as a massive and flat cuboid with a sphere sitting on one end. We can access the new Wolfram|Alpha functionality using EntityValue in the Wolfram Language.


Now we can render the resulting model.

Render the model

The next step is to obtain the gravitational potential of this system. Let’s assume that 2/3 of the total mass is in the cuboid and 1/3 of the total mass is in the sphere. Let’s also assume that the densities of the sphere and the cuboid are the same.

Gravitational potential of system
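The mass split and the equal-density assumption together fix the relative size of the two pieces: with twice the mass at the same density, the cuboid must have twice the sphere's volume. The Python sketch below (an illustration, not the Wolfram Language code behind the figure) uses that constraint and approximates the potential outside the model. The cuboid dimensions, the sphere's placement, and the function names are all assumptions; the cuboid is split into point masses, while the sphere is treated as a point mass at its centre, which is exact outside a uniform sphere.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def sphere_radius_for_equal_density(a, b, c):
    """Radius of a sphere with half the volume of an a x b x c cuboid.

    With 2/3 of the mass in the cuboid, 1/3 in the sphere, and equal
    densities, the sphere's volume must be half the cuboid's.
    """
    return (3.0 * (a * b * c / 2.0) / (4.0 * math.pi)) ** (1.0 / 3.0)


def potential(point, total_mass, dims, n=6):
    """Approximate potential (J/kg) at a point outside the model.

    The cuboid is centred on the origin and split into n^3 point masses.
    The sphere sits on the top face at the +x end (assumed placement).
    """
    a, b, c = dims
    r_s = sphere_radius_for_equal_density(a, b, c)
    centre = (a / 2.0 - r_s, 0.0, c / 2.0 + r_s)
    phi = -G * (total_mass / 3.0) / math.dist(point, centre)
    cell_mass = (2.0 * total_mass / 3.0) / n**3
    for i in range(n):
        for j in range(n):
            for k in range(n):
                cell = ((i + 0.5) / n * a - a / 2.0,
                        (j + 0.5) / n * b - b / 2.0,
                        (k + 0.5) / n * c - c / 2.0)
                phi -= G * cell_mass / math.dist(point, cell)
    return phi
```

Far from the body the result approaches the point-mass value -GM/r, which is a convenient check; close to the surface a finer grid (larger `n`) or the closed-form cuboid potential would be needed.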

Now, we can plot a slice of the potential using ContourPlot. A preliminary estimate of the comet’s total mass is also included.


We can also visualize the gravitational potential in 3D using ContourPlot3D. Here, we’ve chosen to look at a specific iso-potential near the surface of the model.

Visualize gravitational potential

The force of gravity can be found by differentiating the gravitational potential.

Differentiating gravitational potential
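In symbols, the acceleration is the negative gradient of the potential. A minimal Python sketch of the same idea, using central differences instead of symbolic differentiation, might look like this; the point-mass potential and the step size `h` are assumptions used only to keep the example short and checkable.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def point_mass_potential(mass):
    """Return phi(p) = -G M / |p| for a point mass at the origin."""
    def phi(p):
        return -G * mass / math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)
    return phi


def acceleration(phi, p, h=1.0):
    """Acceleration -grad(phi) at p, estimated with central differences."""
    acc = []
    for axis in range(3):
        plus, minus = list(p), list(p)
        plus[axis] += h
        minus[axis] -= h
        acc.append(-(phi(plus) - phi(minus)) / (2.0 * h))
    return tuple(acc)
```

For a point mass the result can be compared with the analytic value -GM/r² along the radial direction; any potential function with the same signature can be passed in instead.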

One of the more difficult problems is that in order to orbit and land on the comet, you have to approach the comet and then slow down so that the gravity of the comet can capture you, the orbiter. This typically requires a rocket to be fired to slow the orbiter down. So here we define a new force, created by this rocket, that opposes the forward velocity of the probe.

Define force
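One simple way to express such a force is as a fixed-magnitude deceleration pointing against the unit velocity vector. The Python sketch below shows that form; the function name, the `burning` flag, and the constant-magnitude assumption are illustrative choices, not necessarily the definition used in the original notebook.

```python
import math


def braking_acceleration(velocity, decel, burning):
    """Acceleration from a rocket that pushes directly against the velocity.

    velocity: (vx, vy, vz) in m/s
    decel:    magnitude of the rocket's deceleration in m/s^2 (assumed constant)
    burning:  whether the rocket is currently firing
    """
    speed = math.sqrt(sum(v * v for v in velocity))
    if not burning or speed == 0.0:
        return (0.0, 0.0, 0.0)
    return tuple(-decel * v / speed for v in velocity)
```

For example, `braking_acceleration((3.0, 4.0, 0.0), 0.5, True)` points along (-3, -4, 0) with magnitude 0.5 m/s².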

For visualization purposes, we want to tag the times when the rocket burns and shuts off, so we create a couple of state variables. NDSolve can be used to handle all of the acceleration, velocity, and position changes. In addition, we can use WhenEvent’s functionality to handle several possible conditions to watch out for, and take specific actions if those conditions are met.
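The overall structure (integrate the motion, watch for conditions, switch the rocket on and off, and record when that happens) can be sketched in Python as a fixed-step loop. This is only a crude stand-in for NDSolve with WhenEvent: the comet is reduced to a point mass in two dimensions, events are checked once per step, and every parameter value and name below is an illustrative assumption.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def simulate(pos, vel, mass=1.0e13, body_radius=2000.0, burn_radius=10000.0,
             max_speed=0.3, target_speed=0.2, decel=1.0e-4,
             dt=10.0, t_max=2.0e6):
    """Integrate a probe around a point-mass comet with simple event handling.

    Events checked each step:
      * start burning when inside burn_radius and faster than max_speed,
      * stop burning once the speed has dropped to target_speed,
      * stop the run on impact (distance at or below body_radius).
    Returns the list of (time, label) events plus the final position and velocity.
    """
    x, y = pos
    vx, vy = vel
    t, burning, events = 0.0, False, []
    while t < t_max:
        r = math.hypot(x, y)
        speed = math.hypot(vx, vy)
        if r <= body_radius:
            events.append((t, "impact"))
            break
        if not burning and r < burn_radius and speed > max_speed:
            burning = True
            events.append((t, "burn on"))
        elif burning and speed <= target_speed:
            burning = False
            events.append((t, "burn off"))
        # Gravity pulls toward the origin; the rocket opposes the velocity.
        ax = -G * mass * x / r**3
        ay = -G * mass * y / r**3
        if burning and speed > 0.0:
            ax -= decel * vx / speed
            ay -= decel * vy / speed
        # Semi-implicit Euler: update the velocity first, then the position.
        vx, vy = vx + ax * dt, vy + ay * dt
        x, y = x + vx * dt, y + vy * dt
        t += dt
    return events, (x, y), (vx, vy)
```

Calling `simulate((20000.0, 0.0), (-0.5, 0.0))` sends the probe straight at the comet; the returned event list plays the role of the state variables that tag burn start and stop times. Giving the initial velocity a sideways component and adjusting `burn_radius` or `decel` is how one would explore capture orbits in this toy version.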

We can start with one possible scenario where we choose an arbitrary starting point and velocity, but the velocity and burn times result in the orbiter hitting the comet instead of obtaining a capture orbit.

Choose an arbitrary starting point and velocity

We can see that a single rocket burn takes place.

Single rocket burn

Red segments show where the rocket was fired to slow down.

With adjustments to the initial velocity and burn distances, we can achieve a more slowly decaying orbit that involves multiple burns.

Slowly decaying orbit

In this case, there are nine separate rocket burns resulting in orbits that start out highly eccentric, but then settle down into closer, more circular orbits.

Nine rocket burns

We can plot the velocity over the course of the orbital insertion and orbit phases. Velocities are measured in meters/second, and time is in seconds.

Velocity over the course of the orbital insertion and orbit phases

The ESA will have a much more difficult problem, likely requiring many separate burns over several months, and this is even before it attempts to drop a lander onto the comet in November. As the spacecraft approaches the comet, more information becomes available, such as refined mass estimates and the location of potentially hazardous debris from jets issuing from the icy comet. All of this must be carefully considered before refining the approach and orbital maneuvers. The next few months should be really exciting!

To experiment with the parameters of this simulation yourself, try using the deployed version created using CloudDeploy in the Wolfram Language.

Download this post as a Computable Document Format (CDF) file.

Demographic Anomalies: US Edition

By Emily Suess, published 2014-07-28

It’s been a while since we looked at American Community Survey data in Wolfram|Alpha. Our first efforts included surveying ACS data related to education, income, and diversity, only touching the tip of the iceberg.

Recently, we took a deeper look at the data to unearth some of the least “average” communities in the US.

As you might guess, at the national level, female and male populations are split almost evenly (50.8% and 49.2%, respectively). But there are metropolitan communities in the US where the split doesn’t hew to the national average, and the ACS data in Wolfram|Alpha lets us find them.

Take Susanville, CA, for example. At just 34.5%, this community in Northern California has the smallest percentage of female residents in the US.

Metropolitan areas with the lowest female population fraction

Or consider the number of married couples living in the United States. The data shows us that 51.4% of the national population ages 15 and older are currently married, but there are individual communities with standout numbers. Utah, at 58%, is the state with the largest married fraction. Dig deeper and you’ll discover that nearly 66% of the population is married in the Brigham City, UT metro area.

Metro areas with highest marriage percentage

When it comes to education, there are more than 2.6 million people in the United States with no formal schooling. That’s about 1.3% of the population aged 25 and over, the group for which the ACS reports educational attainment.

Among the 50 states and Washington, D.C., California tops the list with the highest percentage of citizens having no formal education.

What percent of people have no schooling in US?

On the flip side, the number of people with advanced degrees exceeds 10% of the nation’s population aged 25 and over. That means roughly 21.7 million people have a master’s degree or higher. Of the 50 states and Washington, D.C., the District of Columbia has the highest percentage of residents with advanced degrees, at 28.7%.

What percent of people have advanced degrees in D.C.

We can access poverty data, too. In the following query, we can see what fraction of US senior citizens–individuals over the age of 65–are living below the poverty line. The national average is close to 9%, but the numbers vary significantly by state. Mississippi has the highest percentage, and Alaska has the lowest.

What percentage of senior citizens population is below poverty line in us states

We can also get numbers for more complicated queries. For example, Wolfram|Alpha can compute the percentage of people in Washington State metro areas speaking Tagalog at home.

What fraction of people speak Tagalog in WA metro areas

Or we can compare the population pyramids for the communities of The Villages, FL and Rexburg, ID.

The villages metro area vs Rexburg Idaho metro area population pyramid

Perhaps the best thing about being able to identify demographic outliers is that it helps us make more informed decisions about the places where we prefer to live and work. Which states have the most highly educated populations? Where are the most diverse communities to raise a family? What communities are home to people who speak my native language? What cities would benefit the most from a non-profit organization helping seniors in need?