The Periodic Table Database project
Daniel C. Tofan
Eastern Kentucky University, Chemistry Department

Introduction

The goal of this project is to build an online Periodic Table backed up by a relational database. Using the database as the foundation, and the power of Java programming, the Periodic Table will be displayed and exported in a multitude of formats, based on the request of the user. The database will be an educational tool for anyone looking for information on the chemical elements, and also a source of data for professional applications. A proof of concept is proposed.

Rationale

There are many periodic tables online. So many, in fact, that it is difficult to justify building another one. However, none of them organize the data in the fashion I describe here.

An article that I recently published in Chemistry International, the IUPAC news magazine (freely available online), gives more background on how this project came about. Without repeating what I have already published, I will emphasize that most of the data about the elements that is published online is either hard-coded within static HTML pages, or dynamically generated from some source of data that seems to be hidden from the user and is, for the most part, inaccessible in any format other than the display that the web designers chose for their sites.

My purpose is to organize most known data about the properties of the elements in one single structure. From that structure, data will be extracted on demand and displayed in a variety of formats, or used by other applications in computer readable formats.

Data storage choice

During the initial stages of development of this project, it became apparent that the choice of a suitable technology for data storage was down to two contenders: XML and relational databases. Initially, XML appeared to be the logical choice. The Periodic Table seems suited to a tree-like structure in XML, and the "elements with attributes" paradigm applies well to the chemical elements and their properties. The difficulty, as I explain in more detail in the above-referenced paper, lies in populating and editing the XML structure. Even with a user-friendly tool such as XMLShell, which makes it easy to move nodes around, copy and paste, and perform other operations, it turned out to be a difficult task. The relational database solution won in the end. Below I list the advantages of each of the two technologies over the other, with respect to this particular project.

Advantages of XML

Advantages of a database

Because the data about the elements can be divided into categories (grouping similar properties together), and search capabilities are especially important, it became evident that the relational database solution was far superior for this project.

There was also the issue of a single-table implementation versus a multiple-table one. The database created by the Royal Society is a single-table implementation. That is not a good idea, in my opinion, for several reasons:

A relational database design, based on linked tables, is an optimal design that eliminates data redundancy, saves space, and maximizes search speed. As for the database product, I chose Access 2007 for its user-friendly interface and for the fact that it stores the entire database in one file, which is automatically saved every time the database is updated. The one-file design makes backing up the database very easy. Access also handles concurrent access to the database automatically, which is another selling point, considering that the database will be accessed by many concurrent users.

Table structure

There is one "main" table called "Elements", which uses the atomic number of the element as the primary key. Only the very basic information about the elements, as well as some descriptive properties, is stored in this table. There is only one record per element, which makes the table very clean. All other properties are grouped by type, in separate tables. Those tables are linked to the Elements table through the atomic number, which is a foreign key in each table linked to the main one. Thus, one-to-many relationships are built between the Elements table and each of the related tables. This design is very effective for properties where there is more than one record for each element, an obvious example being isotopes. Units are stored separately, in a table that is not linked. Each property will have a link to one particular record in the "Units" table, thus ensuring consistency between records. Exceptions are made in a few cases, such as the half-lives of the radioactive isotopes, where each half-life is reported in the most convenient unit, rather than converting everything into seconds. The latter is still an option, however.
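As a rough illustration of the structure described above, the following Java sketch models the Elements table and one related table in memory. It is only a sketch under stated assumptions: the class and field names (`PeriodicTableModel`, `ElementRow`, `IsotopeRow`, `SECONDS_PER_UNIT`) are hypothetical and do not come from the actual database, the unit lookup is a simple stand-in for the "Units" table, and the half-lives are commonly quoted approximate values used only as sample data. It shows the atomic number acting as primary key and foreign key, and the optional conversion of half-lives into seconds.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PeriodicTableModel {

    /** Row of the main "Elements" table; atomicNumber is the primary key. */
    static final class ElementRow {
        final int atomicNumber;
        final String symbol;
        final String name;

        ElementRow(int atomicNumber, String symbol, String name) {
            this.atomicNumber = atomicNumber;
            this.symbol = symbol;
            this.name = name;
        }
    }

    /** Row of a hypothetical "Isotopes" table; atomicNumber is the foreign key. */
    static final class IsotopeRow {
        final int atomicNumber;
        final int massNumber;
        final double halfLife; // expressed in the unit named below
        final String unit;     // key into the unit lookup, e.g. "y"

        IsotopeRow(int atomicNumber, int massNumber, double halfLife, String unit) {
            this.atomicNumber = atomicNumber;
            this.massNumber = massNumber;
            this.halfLife = halfLife;
            this.unit = unit;
        }
    }

    // Stand-in for the "Units" table: unit symbol -> length in seconds.
    static final Map<String, Double> SECONDS_PER_UNIT = new HashMap<>();
    static {
        SECONDS_PER_UNIT.put("s", 1.0);
        SECONDS_PER_UNIT.put("d", 86400.0);
        SECONDS_PER_UNIT.put("y", 365.25 * 86400.0); // Julian year (assumed convention)
    }

    private final Map<Integer, ElementRow> elements = new HashMap<>();
    private final Map<Integer, List<IsotopeRow>> isotopes = new HashMap<>();

    /** Inserts an element; a repeated atomic number violates the primary key. */
    void addElement(int atomicNumber, String symbol, String name) {
        if (elements.containsKey(atomicNumber)) {
            throw new IllegalArgumentException("duplicate primary key " + atomicNumber);
        }
        elements.put(atomicNumber, new ElementRow(atomicNumber, symbol, name));
        isotopes.put(atomicNumber, new ArrayList<>());
    }

    /** Inserts an isotope; the atomic number and the unit must already exist. */
    void addIsotope(int atomicNumber, int massNumber, double halfLife, String unit) {
        if (!elements.containsKey(atomicNumber)) {
            throw new IllegalArgumentException("unknown element " + atomicNumber);
        }
        if (!SECONDS_PER_UNIT.containsKey(unit)) {
            throw new IllegalArgumentException("unknown unit " + unit);
        }
        isotopes.get(atomicNumber).add(new IsotopeRow(atomicNumber, massNumber, halfLife, unit));
    }

    /** The "many" side of the one-to-many relationship for one element. */
    List<IsotopeRow> isotopesOf(int atomicNumber) {
        List<IsotopeRow> rows = isotopes.get(atomicNumber);
        if (rows == null) {
            return Collections.<IsotopeRow>emptyList();
        }
        return Collections.unmodifiableList(rows);
    }

    /** Converts a half-life stored in its convenient unit into seconds. */
    static double halfLifeInSeconds(IsotopeRow row) {
        return row.halfLife * SECONDS_PER_UNIT.get(row.unit);
    }

    public static void main(String[] args) {
        PeriodicTableModel db = new PeriodicTableModel();
        db.addElement(1, "H", "Hydrogen");
        db.addElement(6, "C", "Carbon");
        db.addIsotope(1, 3, 12.32, "y");   // tritium, approximate half-life
        db.addIsotope(6, 14, 5730.0, "y"); // carbon-14, approximate half-life
        for (IsotopeRow row : db.isotopesOf(6)) {
            System.out.println(row.massNumber + ": " + halfLifeInSeconds(row) + " s");
        }
    }
}
```

In a real relational database the two checks in `addElement` and `addIsotope` would be enforced by the primary key and foreign key constraints rather than by application code.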

The image below shows the current table structure of the entire database, with the exception of the Units table, which is not yet built. A PDF version will provide better clarity, and clicking on the image will bring up a larger version.

[Figure: table relationships in the current database design]

I am not discussing the detailed table structure here, since field names should be self-explanatory for the most part. A few issues came up when we started to populate the database. For example, allotropes are not yet represented. Since only a few elements have allotropes, it makes sense to create a separate table for them. Then, each set of properties that is allotrope dependent should refer to a record in the allotropes table. How to implement this is not yet clear, but it is definitely an important decision.

Another potential issue is the table of standard reduction potentials. The current design (shown above) is too simplistic: it has the two oxidation states ("from" and "to"), the type of solution (acidic or basic), and the reduction potential. However, for a given oxidation state of an element there are many compounds, and the reduction potential differs depending on which compound is involved. Listing the compound instead of the oxidation state (or both) would be one solution. Compounds are not going to be part of this database, however, so there will be some unavoidable data redundancy. This is not a major issue, but simply an implementation detail that needs to be thought through, and it will most likely lead to a change in the current table design.
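The ambiguity described above can be made concrete with a small Java sketch. Everything in it is an assumption made for illustration: the class `ReductionPotentials`, its `Row` fields, and in particular the extra `halfReaction` text field are hypothetical and are only one possible way to list the compound alongside the oxidation states. The two silver potentials are approximate, commonly tabulated values, and the "acidic" label is there only to fill the field.

```java
import java.util.ArrayList;
import java.util.List;

class ReductionPotentials {

    static final class Row {
        final int atomicNumber;
        final int fromState;
        final int toState;
        final String medium;       // "acidic" or "basic"
        final String halfReaction; // hypothetical extra field naming the species
        final double volts;

        Row(int atomicNumber, int fromState, int toState,
            String medium, String halfReaction, double volts) {
            this.atomicNumber = atomicNumber;
            this.fromState = fromState;
            this.toState = toState;
            this.medium = medium;
            this.halfReaction = halfReaction;
            this.volts = volts;
        }
    }

    /** Returns every row that matches the fields of the current design. */
    static List<Row> find(List<Row> rows, int atomicNumber,
                          int fromState, int toState, String medium) {
        List<Row> matches = new ArrayList<>();
        for (Row r : rows) {
            if (r.atomicNumber == atomicNumber && r.fromState == fromState
                    && r.toState == toState && r.medium.equals(medium)) {
                matches.add(r);
            }
        }
        return matches;
    }

    /** Sample data: two silver(I) -> silver(0) couples with different potentials. */
    static List<Row> sampleRows() {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(47, 1, 0, "acidic", "Ag+ + e- -> Ag", 0.80));
        rows.add(new Row(47, 1, 0, "acidic", "AgCl + e- -> Ag + Cl-", 0.22));
        return rows;
    }

    public static void main(String[] args) {
        // The oxidation states and the medium alone do not identify a single potential.
        for (Row r : find(sampleRows(), 47, 1, 0, "acidic")) {
            System.out.println(r.halfReaction + " : " + r.volts + " V");
        }
    }
}
```

Both rows share the same element, oxidation states, and medium, so only the additional text field tells them apart. That field is free text rather than a reference to a compounds table, which is the redundancy mentioned above.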

Conversion to XML

XML is still desirable for applications that are designed to read this format, and for interoperability with other XML formats. The advantage of storing the data in a relational database rather than in an XML structure is also apparent from the fact that the XML can be generated on demand, instead of being used as the primary data holder. Because the XML structure can be generated programmatically, it is guaranteed to be well formed and valid. As long as the data entry process is accurate, the validity of the XML data is ensured through programming. DTD validation can be done, but it is unnecessary.
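A minimal sketch of generating the XML on demand is given here, using the DOM and transformer classes that ship with the Java standard library. The class name `ElementXmlExport`, the nested `Entry` type, and the XML vocabulary (`periodicTable`, `element`, `atomicNumber`, `symbol`, `name`) are assumptions chosen for illustration, not the structure of the actual project, and the rows are passed in directly where a real implementation would read them from the database.

```java
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class ElementXmlExport {

    /** One record as it might come back from a query on the Elements table. */
    static final class Entry {
        final int atomicNumber;
        final String symbol;
        final String name;

        Entry(int atomicNumber, String symbol, String name) {
            this.atomicNumber = atomicNumber;
            this.symbol = symbol;
            this.name = name;
        }
    }

    /** Builds a DOM tree from the records and serializes it to a string. */
    static String toXml(List<Entry> entries) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("periodicTable");
            doc.appendChild(root);
            for (Entry entry : entries) {
                Element element = doc.createElement("element");
                element.setAttribute("atomicNumber", String.valueOf(entry.atomicNumber));
                element.setAttribute("symbol", entry.symbol);
                Element name = doc.createElement("name");
                name.setTextContent(entry.name); // special characters are escaped on output
                element.appendChild(name);
                root.appendChild(element);
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("XML generation failed", e);
        }
    }

    public static void main(String[] args) {
        List<Entry> groupFive = new ArrayList<>();
        groupFive.add(new Entry(23, "V", "Vanadium"));
        groupFive.add(new Entry(41, "Nb", "Niobium"));
        groupFive.add(new Entry(73, "Ta", "Tantalum"));
        System.out.println(toXml(groupFive));
    }
}
```

Because the document is assembled as a tree and only then serialized, tags are always closed and special characters are escaped by the library, which is what makes the output well formed. Validity against a particular DTD or schema still depends on the generating code following that vocabulary.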

This link will open a sample XML file that was created by one of my students (see my other paper on a new course on computers in chemical education). The file contains the XML structure created for metals in groups 5 and 6. The number of properties listed is small. This particular XML structure illustrates one of the issues I had with the XML version. The main nodes in the structure shown here are the periodic table groups, but they could have been the periods just as easily. In the database design, this notion does not exist, as there is no tree structure to define.

Conversion to presentation formats

Java programs will be used to generate HTML (for the web) and PDF (for print) from the data. This will be a substantial effort, because of all the formatting information that needs to be programmed. The type and quantity of data to be displayed will also be a factor. The advantage, however, is that once the programming part is completed, data can be displayed dynamically upon request. A collection of Java servlets will be the transport and display mechanism between the database and the web server.
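The HTML side of this conversion might look like the following sketch, which turns rows of text into an HTML table. The class name `ElementHtmlView` and its methods are hypothetical. The sketch deliberately leaves out the servlet layer and the database query, so the rows are supplied as plain lists. A servlet would, under this assumption, simply write the returned string to its response.

```java
import java.util.Arrays;
import java.util.List;

class ElementHtmlView {

    /** Escapes the characters that are special in HTML text content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Renders a header row and data rows as an HTML table. */
    static String toHtmlTable(List<String> headers, List<List<String>> rows) {
        StringBuilder html = new StringBuilder("<table>\n<tr>");
        for (String header : headers) {
            html.append("<th>").append(escape(header)).append("</th>");
        }
        html.append("</tr>\n");
        for (List<String> row : rows) {
            html.append("<tr>");
            for (String cell : row) {
                html.append("<td>").append(escape(cell)).append("</td>");
            }
            html.append("</tr>\n");
        }
        return html.append("</table>").toString();
    }

    public static void main(String[] args) {
        List<String> headers = Arrays.asList("Atomic number", "Symbol", "Name");
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("1", "H", "Hydrogen"),
                Arrays.asList("2", "He", "Helium"));
        System.out.println(toHtmlTable(headers, rows));
    }
}
```

The same rows could feed a PDF generator instead. Only the rendering method would change, not the query that produces the data.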

My other idea is to create an interactive Java applet that will display one property at a time for all elements, in a periodic table shape. By using colors and other methods of visually coding the magnitude of the elemental properties, it should be possible to display any of the periodic table trends interactively. Such attempts already exist using JavaScript, but the dynamic generation of the display using a database back-end is clearly more powerful and hides all server-side work.
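One simple way to code the magnitude of a property visually is a linear color scale. The sketch below is only an illustration of that idea: the class `ElementColorScale`, the blue-to-red scale, and the use of hexadecimal color strings are all assumptions, and the electronegativities in `main` are commonly quoted Pauling values used as sample data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ElementColorScale {

    /**
     * Maps a property value onto a blue-to-red scale and returns it as "#RRGGBB".
     * Values outside [min, max] are clamped to the ends of the scale.
     */
    static String colorFor(double value, double min, double max) {
        double t = (max > min) ? (value - min) / (max - min) : 0.0;
        t = Math.max(0.0, Math.min(1.0, t));
        int red = (int) Math.round(255 * t);
        int blue = (int) Math.round(255 * (1 - t));
        return String.format("#%02X%02X%02X", red, 0, blue);
    }

    public static void main(String[] args) {
        // Sample data: approximate Pauling electronegativities.
        Map<String, Double> electronegativity = new LinkedHashMap<>();
        electronegativity.put("Li", 0.98);
        electronegativity.put("C", 2.55);
        electronegativity.put("F", 3.98);

        double min = Double.MAX_VALUE;
        double max = -Double.MAX_VALUE;
        for (double v : electronegativity.values()) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        for (Map.Entry<String, Double> e : electronegativity.entrySet()) {
            System.out.println(e.getKey() + " " + colorFor(e.getValue(), min, max));
        }
    }
}
```

Each cell of the periodic table display would then be filled with the color returned for that element, so that a trend such as electronegativity across a period shows up as a gradient.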

The programmatic conversion of the data will make this educational resource available in several different formats, anywhere in the world. The mechanism I propose here should be very useful to other projects, such as the Isotopic Periodic Table project currently under development by IUPAC (see this recent article in Chemistry International).

Current status

I built a periodic table that links each element symbol to a PDF slide that has some information about the element. These slides were created manually, but they represent one example of what could be generated programmatically using the database and the Java conversion mechanisms. Some progress has been made on populating a mock version of the database with information about the elements, using mostly data from WebElements. I am in the process of trying to secure funding in order to move the project forward.

I welcome any suggestions that can contribute to the improvement of the design and content of this important project. My aim is to create a comprehensive, open source of information about the chemical elements that can interact with other applications through the mechanisms that I described in this paper.

Acknowledgments

I thank the students in the CHE 111, 501, 701, and 805 courses at Eastern Kentucky University who contributed to the early versions of the project. Special thanks to Jennifer Imel, Sasha Howard Porter, and Victoria Diersen.

Fred Bayer, who provided the images of the elements used in the actual PDF slides on my website.

WebElements, Wikipedia, IUPAC.

Return to the contents page of the Fall 2008 issue of the Computers in Chemical Education Newsletter.