GCN : May 2014
One of my current projects is the review of an application built by a contractor for a major federal agency. The code relies heavily on queries and stored procedures against a relational database management system (RDBMS). These are easily the most complex I have ever seen.

When technical debt accrues in a database, the usual remedy is to refactor the database design and use views, indexes, precomputation and all the other goodies an RDBMS offers. However, it may very well be that the relational paradigm, as venerable and successful as it is, is simply not the best choice for an application.

There are alternative paradigms, notably document and graph databases -- colloquially known as NoSQL databases -- that have their own advantages and disadvantages. But let's first remind ourselves why RDBMSs have dominated for so long before exploring how NoSQL databases compare.

RELATIONAL DATABASES

With theoretical foundations in mathematics and powerful commercial and open-source implementations (e.g., Oracle and PostgreSQL, respectively), RDBMSs have flourished for decades. They have powerful attributes we take for granted, including atomicity, consistency, isolation and durability (the ACID guarantees), as well as SQL, ad hoc queries and commercial support.

Yet RDBMSs aren't perfect. When the data gets big, they don't generally scale easily, even with sharding. Database schemas are also notoriously rigid in application development. Changes to a single column reverberate among views, stored procedures and application code.

DOCUMENT DATABASES

Sometimes features like ACID can be more trouble than they're worth. Enter document databases like MongoDB and Apache CouchDB -- both open source. "Documents" are flexible JSON structures (binary JSON in MongoDB's case) in which child records that an RDBMS would model with relationships are instead embedded within the documents. Aside from scalability, the biggest advantage is simplicity.
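To make the embedding idea concrete, here is a minimal sketch in plain Python, with no database or driver involved. The order record, its field names and the `order_total` helper are hypothetical, chosen only for illustration; they do not reflect any particular product's API.

```python
import json

# Hypothetical order record (all names are illustrative assumptions).
# In an RDBMS the line items would live in a child table keyed by order id;
# in a document model they are embedded directly in the parent document.
order_document = {
    "_id": 1001,
    "customer": "Acme Corp",
    "items": [
        {"sku": "A-1", "qty": 2, "unit_price": 9.50},
        {"sku": "B-7", "qty": 1, "unit_price": 20.00},
    ],
}


def order_total(doc):
    """Sum the embedded line items; no join against a child table is needed."""
    return sum(item["qty"] * item["unit_price"] for item in doc["items"])


# The whole document, children included, serializes to JSON in one step.
payload = json.dumps(order_document)
```

Calling `order_total(order_document)` reads everything it needs from the one document, which is the simplicity being claimed. The trade-off is that the same line item cannot be shared between documents without duplicating it.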
Once one becomes familiar with JSON, querying is straightforward. Without a predefined schema, data can evolve as needed. Joins are obviated because data is denormalized. And because JSON is the most common data format on the Web, developers may be able to use a driver to pass data from the database straight through to the front end.

On the other hand, if records embedded in a document change frequently (such that they are factored out into their own documents in a relational manner), the result may be foreign key relationships without ACID guarantees -- meaning possible orphan records.

Consider document databases when queries can be anticipated or when there is a need to scale and/or make relatively few updates to preexisting data.

GRAPH DATABASES

Anyone who watched The Wire might remember the bulletin board used by the Major Crimes Unit to display an evolving org chart for crime syndicates in Baltimore. The hierarchy was determined through analysis of communications and other data. That was the first graph database I ever saw.

With a graph database, data is modeled as a collection of nodes connected by edges -- both endowed with attributes. This is a fundamental shift from RDBMSs. As always, the data model must be optimized for the anticipated queries -- for example, when deciding whether certain data belongs in a node or an edge.

When working with network data (such as SIGINT, financial transactions or migration patterns), modeling in tables and relationships can be awkward. Much worse, RDBMSs can be quite slow for the kinds of queries that matter on graphs, like shortest paths, community detection and centrality.

Also built upon a mathematical foundation, graph databases like the open-source Neo4j are ideal for storing and querying network data. Like RDBMSs, Neo4j supports ACID transactions and indexing. Commercial support is available, as are drivers for all major programming languages.

Data is the lifeblood of applications.
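The graph model described above, with nodes and edges each carrying attributes, can be sketched in plain Python. This is an illustration under stated assumptions, not how Neo4j or any graph database works internally: the people, the `calls` attribute and the `shortest_path` helper are all hypothetical, and the search is an ordinary breadth-first search over an in-memory structure.

```python
from collections import deque

# Hypothetical network (all names and attributes are illustrative assumptions).
# Nodes carry attributes, and so do the edges that connect them.
nodes = {
    "alice": {"role": "analyst"},
    "bob": {"role": "courier"},
    "carol": {"role": "manager"},
    "dave": {"role": "supplier"},
}
edges = {
    ("alice", "bob"): {"calls": 12},
    ("bob", "carol"): {"calls": 3},
    ("carol", "dave"): {"calls": 7},
    ("alice", "dave"): {"calls": 1},
}


def neighbors(node):
    """Yield the nodes adjacent to `node`, treating every edge as undirected."""
    for a, b in edges:
        if a == node:
            yield b
        elif b == node:
            yield a


def shortest_path(start, goal):
    """Breadth-first search: return a path with the fewest hops, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors(path[-1]):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None
```

Here `shortest_path("bob", "dave")` walks edges directly from node to node. Expressing the same traversal over relational tables typically takes a self-join per hop or a recursive query, which is the awkwardness the text refers to.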
While RDBMSs will always be robust and powerful and perhaps most familiar, follow the advice of lean software development experts Mary and Tom Poppendieck and consider all options to make applications easier to develop and faster to run.

Neil A. Chaudhuri is founder and president of Vidya and has over a decade of experience building complex software projects for commercial and government clients.

WHEN RELATIONAL DATABASE MANAGEMENT SYSTEMS MAY JUST NOT BE THE BEST CHOICE
INDUSTRY INSIGHT
BY NEIL CHAUDHURI