A Year in a Chemistry Developer’s Shoes
I have always seen the beauty of chemistry from a scientific standpoint: strange shapes, eye-catching patterns, giant explosions… But it was not until I came to Wolfram|Alpha that I began to appreciate just how sleek chemistry is from a programming perspective. Just a few lines of code are needed to create some of the most startling phenomena and give life to elegant theories.
In Wolfram|Alpha, some property values are stored in a database and are called non-computed properties; the remaining properties are called computed properties because they are calculated from the non-computed properties. MoleculePlot is a computed property, and so generating the above plot takes a bit more than one line of code (internally, the main function takes roughly 1,000 lines of code). Yet even if you restrict yourself to non-computed properties, you can create such diagrams with very little code. Here is a basic diagram of acetic acid from non-computed properties compared to the computed property CHBlackStructureDiagram.
Of course, this code is bare bones: it cannot display ions or isotopes, it will squish large molecules, and it does not have color. Nevertheless, for a few lines of code, it does a surprisingly good job.
To give you some perspective, in this past year I added 35,000 lines of code and modified 30,000 existing lines. Much of it went into creating new functionality such as our periodic table (200 lines of code). The beauty of this periodic table lies not only in its new color scheme (which matches Mathematica 10’s color scheme, and can be seen through the command ColorData), but also in the pull-down menu featuring neat properties such as boiling point, electronegativity, and the year each element was discovered.
A lot of work went into upgrading our framework so that when you query for properties such as MoleculePlot or CHBlackStructureDiagram, the result comes back quickly and efficiently. Sometimes the renovations are quick and simple, such as showing as much data as possible (20 lines of code); sometimes they require completely rethinking how we handle objects such as chemical reactions (5,000 lines of code and a lot of thought).
While it is fantastic to find an answer to your query, it is even better to understand the procedure used to get to that answer. One of the most exciting new directions we took this year was to expand our Step-by-step functionality into chemistry.
While computing ATP’s MoleculePlot above may be an intricate task, writing these Step-by-steps can quickly become breathtakingly complicated. As with all of Wolfram|Alpha, the steps are written using Mathematica, which means it is up to the developer to program the grammatical structure. As you might imagine, this can cause what appears to be a very basic sentence to be much more complicated under the hood. Let’s take a basic example: the second step lists the number of valence electrons in the chemical. There can be one, two, or three or more types of atoms—each of which requires different grammatical syntax. Additionally, other clauses may be stuck in the beginning, middle, or end of the sentence to describe special cases (e.g. for chemicals with a nonzero net charge), each of which requires support for punctuation, capitalization, and grammar.
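To make the grammar problem concrete, here is a minimal Python sketch (not the Mathematica code behind the actual steps) that builds the valence-electron sentence for one, two, or three or more types of atoms, with an optional clause for a nonzero net charge. The function names, the sentence wording, and the input format are all assumptions made for illustration.

```python
def plural(n, noun):
    """Return '1 atom' or '2 atoms' (naive pluralization, enough for this sketch)."""
    return f"{n} {noun}" if n == 1 else f"{n} {noun}s"


def join_clauses(parts):
    """Join clauses with the punctuation each case needs: one, two, or three or more."""
    if len(parts) == 1:
        return parts[0]
    if len(parts) == 2:
        return " and ".join(parts)
    return ", ".join(parts[:-1]) + ", and " + parts[-1]


def valence_sentence(name, atoms, net_charge=0):
    """Build the sentence from (element, atom count, valence electrons per atom) tuples."""
    parts = [
        f"{count * valence} from {plural(count, element + ' atom')}"
        for element, count, valence in atoms
    ]
    # A negative net charge adds electrons; a positive one removes them.
    total = sum(count * valence for _, count, valence in atoms) - net_charge
    sentence = (
        f"{name[0].upper() + name[1:]} has "
        f"{plural(total, 'valence electron')}: {join_clauses(parts)}"
    )
    if net_charge < 0:
        sentence += f", plus {-net_charge} for the net charge of {net_charge}"
    elif net_charge > 0:
        sentence += f", minus {net_charge} for the net charge of +{net_charge}"
    return sentence + "."


print(valence_sentence("acetic acid", [("carbon", 2, 4), ("hydrogen", 4, 1), ("oxygen", 2, 6)]))
print(valence_sentence("hydroxide", [("oxygen", 1, 6), ("hydrogen", 1, 1)], net_charge=-1))
```

Even this toy version needs separate branches for pluralization, list punctuation, capitalization, and the charge clause, which hints at how quickly the real steps grow.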
So far, so simple. But to do a really great job, the steps have to vary their diction, intelligently short-circuit to the answer whenever possible, and generally sound human. We want you to think of Wolfram|Alpha as your best friend, not only ready to answer your questions, but also to explain how she arrived at her answers.
All this would be tricky, but still straightforward, if it were not for all of the crazy exceptions that abound in chemistry! For practically every rule you can think of involving molecular bonding, there is an exception out there (as you learn from your first chemistry course); with over 40,000 molecules in our database, you can test any hypothesis and immediately find counter-examples. As the steps progress, they can branch off into an incredible number of different paths. It’s a thrilling ride to construct such an algorithm in a cogent and coherent pattern (not to mention generating robust tests for both calculation errors and grammatical inconsistencies)!
That is what made creating the Lewis structure Step-by-step such an exciting experience, especially after I discovered that the standard procedure to draw Lewis structures fails for many molecules in our database. Ultimately, the end product was a very robust Step-by-step that accounts for nearly all ChemicalData entities. (For full disclosure: the steps do not work for 3-center 2-electron bonds or 3-center 4-electron bonds, such as those in diborane or bifluoride. Therefore, the steps can handle all of our chemicals except two.)
To give some perspective, simply drawing the Lewis dot structure diagrams took 200 lines of code; the Step-by-step took 1,000 lines of code.
User feedback was very positive for this first Step-by-step, prompting us to expand the functionality to other areas of chemistry. We decided to first tackle the most popular chemistry queries in Wolfram|Alpha, aiming primarily at important topics someone would learn in an introductory chemistry course. In addition, the Step-by-steps should fit together and complement one another. Creating the Lewis structure Step-by-step naturally suggested calculating oxidation states. We also wanted to break into chemical reactions: balancing chemical reactions begot computing reaction stoichiometry, which begot converting between units, which begot preparing solutions.
We tried to make each Step-by-step encompass as wide a field as possible. For example, the stoichiometry Step-by-step will calculate the theoretical yield of products if only given amounts for reactants (e.g. 2 grams glucose + oxygen -> water + carbon dioxide), will calculate the percentage yield of products if given amounts of reactants and products (e.g. 0.2 mol CH4 + O2 -> 7 mL H2O + CO2), and will calculate the amount of reactants needed if only given amounts of products (e.g. C6H6 + NO2+ -> 0.02 mols C6H5NO2 + H+). We plan to continue expanding to other areas, and user feedback always helps us prioritize what to tackle next!
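The arithmetic behind the first two example queries can be sketched in a few lines of Python. This is only an illustration of the textbook calculation, not the Step-by-step implementation. It assumes the balanced equations C6H12O6 + 6 O2 -> 6 H2O + 6 CO2 and CH4 + 2 O2 -> 2 H2O + CO2, oxygen in excess, rounded molar masses entered by hand, and a water density of 1.00 g/mL for converting 7 mL to grams.

```python
# Molar masses in g/mol, computed by hand from standard atomic weights (rounded).
MOLAR_MASS = {"C6H12O6": 180.156, "CH4": 16.043, "H2O": 18.015, "CO2": 44.009}


def theoretical_yield_grams(reactant_moles, reactant_coeff, product, product_coeff):
    """Grams of product formed if the given reactant is completely consumed."""
    product_moles = reactant_moles * product_coeff / reactant_coeff
    return product_moles * MOLAR_MASS[product]


def percent_yield(actual_grams, theoretical_grams):
    """Actual yield as a percentage of the theoretical yield."""
    return 100 * actual_grams / theoretical_grams


# Query 1: 2 grams glucose + oxygen -> water + carbon dioxide
glucose_moles = 2 / MOLAR_MASS["C6H12O6"]
water_grams = theoretical_yield_grams(glucose_moles, 1, "H2O", 6)
co2_grams = theoretical_yield_grams(glucose_moles, 1, "CO2", 6)
print(f"{water_grams:.2f} g H2O, {co2_grams:.2f} g CO2")

# Query 2: 0.2 mol CH4 + O2 -> 7 mL H2O + CO2
WATER_DENSITY = 1.00  # g/mL, an assumed round value
actual_water_grams = 7 * WATER_DENSITY
methane_water_grams = theoretical_yield_grams(0.2, 1, "H2O", 2)
print(f"{percent_yield(actual_water_grams, methane_water_grams):.1f}% yield of H2O")
```

Under these assumptions, the glucose query gives roughly 1.20 g of water and 2.93 g of carbon dioxide, and the methane query gives a water yield of about 97%.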
For users who want to get into the driver’s seat, Wolfram|Alpha’s tighter integration into Mathematica 10 grants mouth-watering access to our chemistry data (along with element, isotope, and thermodynamic data). We exposed over 50 new properties and gave the remaining properties a serious tune-up, including upgrading to the new Version 10 formats for Quantity, Entity, and Association!
Both new and experienced users should check out the ChemicalData documentation page, where all the chemistry functionality is explained along with numerous examples of how it can be used. Once you are ready and eager to access the data, we recommend using the Ctrl+= interface to discover entities and properties. For example, typing “benzene” into this interface will yield the right-hand side of:
Alternatively, you can peruse our entire list of chemicals. Here’s a random sample of five entities.
Each entity has a list of properties that you can query. The full list is rather large, so let’s just focus in on properties that begin with the letter M.
You can now use ChemicalData to perform an entity-property query and to access the data for each of these properties.
You can use free-form input to discover a specific entity-property query directly by typing it into the Ctrl+= interface. For example, typing in “benzene molar mass” yields a ChemicalData expression that you can evaluate to find the molar mass.
Because this interface links directly to Wolfram|Alpha, it grants you significant flexibility on how you enter your input. For example, you can alter the order of the input (“molar mass benzene”), insert filler words (“what is the molar mass of benzene”), use generic phrases (“benzene mass”), misspell words (“benzne masss”), and so much more. Wolfram|Alpha has got your back!
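For a feel of what this kind of tolerance involves, here is a toy Python sketch that handles reordering, filler words, a generic phrase, and misspellings for a tiny vocabulary. It is emphatically not how Wolfram|Alpha parses input. It swaps in simple fuzzy string matching from Python's standard difflib module, and the lookup tables (ENTITIES, PROPERTIES) and the 0.8 similarity cutoff are made-up assumptions for illustration.

```python
from difflib import get_close_matches

# Hypothetical mini-vocabulary; a real system knows vastly more.
ENTITIES = {"benzene": "Benzene", "cyclobutane": "Cyclobutane"}
PROPERTY_WORDS = {"molar", "mass", "boiling", "point"}
PROPERTIES = {
    frozenset({"molar", "mass"}): "MolarMass",
    frozenset({"mass"}): "MolarMass",  # generic phrase mapped to the likeliest property
    frozenset({"boiling", "point"}): "BoilingPoint",
}
VOCABULARY = list(ENTITIES) + sorted(PROPERTY_WORDS)


def normalize(token):
    """Map a possibly misspelled token to a known word, or None for filler."""
    matches = get_close_matches(token.lower(), VOCABULARY, n=1, cutoff=0.8)
    return matches[0] if matches else None


def parse(query):
    """Return (entity, property) regardless of word order, filler, or small typos."""
    words = [w for w in (normalize(t) for t in query.split()) if w]
    entity = next((ENTITIES[w] for w in words if w in ENTITIES), None)
    prop = PROPERTIES.get(frozenset(w for w in words if w in PROPERTY_WORDS))
    return entity, prop


for query in [
    "benzene molar mass",
    "molar mass benzene",
    "what is the molar mass of benzene",
    "benzene mass",
    "benzne masss",
]:
    print(query, "->", parse(query))
```

All five inputs resolve to the same pair, ("Benzene", "MolarMass"), because unknown words are dropped, word order is ignored, and near-misses are snapped to the closest known word.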
With that introduction, you have all the tools to play around in various fields of chemistry. For example, you can do a systematic study of your favorite molecule—let’s pick cyclobutane. One of the important features of cyclobutane is its bonding structure, so let’s consider the chemicals whose graphs are isomorphic to cyclobutane’s along with their boiling points and molar masses.
Here we find both the general trend that boiling point increases with molar mass and the (expected?) exceptions: the chemicals in the first and third slots. This invites a series of questions: Do fluorocarbons like octafluorocyclobutane always have a significantly lower boiling point than their hydrocarbon counterparts? Is fluorine special, or would the same phenomenon happen with other halogens? Could we make a program that can predict a molecule’s boiling point based on its structure diagram? There are countless directions to go!
The combination of Mathematica and Wolfram|Alpha transforms chemistry into an incredibly fun subject to explore. What amazing things can you create once you have the right tools for the job? We have worked to make chemistry not just intellectually stimulating and visually appealing, but a truly sexy aspect of the Wolfram Language. As we continue expanding Wolfram|Alpha’s capabilities, we welcome your recommendations. Share your thoughts on how to make chemistry even more beautiful, suggest what future directions you would like us to delve into, or simply shout out to the world that you love Wolfram|Alpha!