Big Datasets from Small Experiments

By Promotion | Published on Jul 26 2017

Modern experiments produce lots of data. Abundant data is not exclusive to the "big gun" institutions, such as observatories and particle colliders. It is also the norm in modest-size labs working on anything from genomics to microscopy. Even outside of science you don't need to go far to find lots of data. With the Internet-of-Things (IoT) at play, a modern smart home is a continuous source of big datasets! With data collection as easy as it is, how does one analyze the data efficiently?

The work of Prof. Jeffrey Dunham connects real-world phenomena to data collection to computing in a very pure experiment. He has built a tabletop-scale chaotic pendulum equipped with a high-precision rotary encoder. The pendulum produces hundreds of gigabytes of data per day. This data reveals the strange attractor of the pendulum, which is a fractal. This manifestation of "order in chaos" is not only a thing of beauty. It has roots in chaos theory, which also applies to climate studies, biology, cryptography, and technology. However, the amazing fractal structure of the data emerges only with proper post-processing. “Proper” means that the experimenter must scan the parameter space of the Savitzky-Golay filter. For each point in that space, the computationally expensive filter must be applied to the entire dataset, so the total cost grows with the number of parameter combinations tried. For good science in this experiment, computational performance is paramount.
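To make the idea of scanning a filter's parameter space concrete, here is a minimal Python sketch. It is not Prof. Dunham's code: it assumes SciPy's `savgol_filter`, treats window length and polynomial order as the scanned parameters, and uses a small synthetic signal in place of real encoder data. The helper name `scan_savgol` and all values are hypothetical.

```python
import numpy as np
from scipy.signal import savgol_filter


def scan_savgol(signal, window_lengths, polyorders):
    """Filter the whole signal once per (window length, polynomial order) pair."""
    results = {}
    for window in window_lengths:
        for order in polyorders:
            # savgol_filter requires the polynomial order to be below the window length.
            if order >= window:
                continue
            results[(window, order)] = savgol_filter(signal, window, order)
    return results


# Synthetic stand-in for encoder readings (an assumption): a smooth curve plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 2000)
angle = np.sin(t) + 0.05 * rng.standard_normal(t.size)

# Every point of the 3 x 3 grid costs one full pass over the data.
filtered = scan_savgol(angle, window_lengths=[11, 31, 51], polyorders=[2, 3, 4])
```

Each grid point requires a full pass over the signal, which is why a scan over hundreds of gigabytes becomes a performance problem.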

In his upcoming presentation in the Modern Code Contributed Talks ("MC² Series"), Prof. Dunham shares his experience with this computational challenge. He talks about the modern code practices that allowed him to shrink the data processing time from hours to fractions of a second. That was made possible through two factors. The first is the use of an Intel® Xeon Phi™ processor (formerly Knights Landing). The second is a thoughtful approach to parallel programming. Prof. Dunham also talks about probing the peak performance of these processors, the roofline model, and the importance of vector arithmetic.
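For readers unfamiliar with the roofline model mentioned above, its core is a single formula: attainable performance is the smaller of the processor's peak compute rate and the product of memory bandwidth and arithmetic intensity (floating-point operations per byte moved). The Python sketch below illustrates it. The peak and bandwidth figures are placeholders chosen for easy arithmetic, not measured specifications of any processor, and the function name `roofline` is hypothetical.

```python
def roofline(peak_gflops, bandwidth_gbs, intensity_flops_per_byte):
    """Attainable GFLOP/s under the roofline model."""
    # A kernel is limited either by compute (peak) or by memory traffic
    # (bandwidth times FLOPs performed per byte moved), whichever is lower.
    return min(peak_gflops, bandwidth_gbs * intensity_flops_per_byte)


# Placeholder machine parameters (assumptions, not vendor figures).
PEAK_GFLOPS = 3000.0
BANDWIDTH_GBS = 400.0

# Low arithmetic intensity: the kernel is memory-bound.
memory_bound = roofline(PEAK_GFLOPS, BANDWIDTH_GBS, 0.25)

# High arithmetic intensity: the kernel is compute-bound.
compute_bound = roofline(PEAK_GFLOPS, BANDWIDTH_GBS, 10.0)

# Intensity at which the two limits meet (the "ridge point").
ridge_point = PEAK_GFLOPS / BANDWIDTH_GBS
```

Kernels whose intensity falls below the ridge point gain more from reducing memory traffic than from additional vectorization, which is one reason the model is useful when tuning code.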

For more IoT resources and tools from Intel, please visit the Intel® Developer Zone.

Source: https://software.intel.com/en-us/blogs/2017/07/06/big-datasets-from-small-experiments
