How Scientists Deal with the Problems and Potential of Big Data

The capacity to organize huge amounts of information at every step of the discovery, development and commercialization cycle is essential to success, and big data analytics is helping scientists turn raw data into actionable insights.

Chr. Hansen is a Danish supplier of bioscience ingredients for the food and health industries. When the company adopted an electronic laboratory notebook system for collecting and sharing data among its scientists, a flood of data overwhelmed them. Previously, the scientists had analysed each compound manually and noted the results on paper, a system that had been in place for decades. Chr. Hansen also implemented robots and tools powerful enough to conduct 500 simultaneous analytical trials on a compound. That’s when the volume of data exploded. The scientists didn’t know how to handle the complexity and sheer quantity of the data they were producing.

Enter Morten Meldgaard and Kaare Buch Petersen. Meldgaard is the program manager tasked with improving how the 140-year-old company manages its data explosion; Petersen is an information technology (IT) specialist. The duo have leapt ahead of larger industry competitors by adopting cloud-based data-analysis solutions. Total monthly cost for infrastructure and software as a service (SaaS): about US$1,000.

The data explosion is a major headache for scientists working in fields like chemistry and biology, and research data is just the beginning. Other functions, including development, regulatory review, manufacturing and distribution, generate their own mountains of information.

“People are now overwhelmed with lots of data,” said Alan S. Louie, research director at IDC Health Insights, in Framingham, Massachusetts (USA). Storing data is a challenge, but “the ability to process that data and formulate it into coherent theories is much harder,” Louie said. Each stage of development and commercialization depends on data from the phase that comes before it, and each generates more data that needs to be fed back up the chain to make future activities more efficient.

Cloud computing can help organize and make sense of the clutter. The cloud model also streamlines collaboration, because research companies can easily set up partnerships with specialized research or manufacturing companies.

“The cloud is quite ideal,” said Andrew Brosnan, an analyst for UK-based research firm Ovum. “It is easily scalable and easily extendable. If you have a two-year project with a contract research organization in Switzerland, you can extend the IT environment to that collaborator. That’s the trend we’re moving toward.”

Meldgaard and Petersen realized that using cloud and SaaS infrastructure is much easier and less expensive than developing a system from the ground up. The approach helped Chr. Hansen’s scientists quickly identify patterns in their data. “They saw our solution as a magic wand that could save them a lot of time,” Petersen said.

Many companies suffer from “dark data” syndrome, where useful data exists but can’t be searched for re-use or made broadly available to internal departments. But Chr. Hansen’s new tools have changed the company’s entire approach to storing and accessing data.

“We want to explore the data rather than just looking at the data in siloes,” Meldgaard said. “Big Data storage seems almost perfect for us. We can store it, and afterward start playing around with it and interpreting it. Traditional databases demanded that we pre-decide how we wanted the data to look and how it would be interpreted.”
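
The store-first, interpret-later approach Meldgaard describes is often called schema-on-read. The short Python sketch below illustrates the general idea only; the record fields and the read_field helper are hypothetical and do not represent Chr. Hansen’s actual system.

```python
import json

# Hypothetical raw records, stored exactly as they arrived.
# No table layout was decided in advance, so records may carry different fields.
raw_store = [
    '{"trial": 1, "compound": "A", "ph": 4.6}',
    '{"trial": 2, "compound": "A", "ph": 4.4, "viscosity": 1.8}',
    '{"trial": 3, "compound": "B", "temperature_c": 37}',
]


def read_field(store, field):
    """Interpret the raw records at read time, keeping only those that have `field`."""
    values = []
    for line in store:
        record = json.loads(line)
        if field in record:
            values.append((record["trial"], record[field]))
    return values


# The question is chosen after the data has been stored, not before.
ph_values = read_field(raw_store, "ph")
viscosity_values = read_field(raw_store, "viscosity")
print(ph_values)
print(viscosity_values)
```

A traditional fixed-schema table would have needed a column for every measurement before the first trial was recorded; here a new measurement simply appears in later records and can be queried as soon as someone thinks to ask for it.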

Ease of access is powerful and convenient. “The scientists can go in there, get their hands on it and try out their ideas,” Meldgaard said. “They can pull out data and play with it and test their theories or visualize the data and the patterns.”

The next logical step is to implement the system across the other departments of the company. Product developers want access to production data, and vice versa; the sales, finance and legal departments want to be connected too.

The ultimate evolution would be to tap into what consumers of end products say on Facebook or Twitter, about a particular yogurt flavour for example, and to make that data available to the scientists who are trying to improve products.