Big Time Series Analysis with JuliaDB

Dr. Josh Day of Julia Computing takes a look into the multi-indexed database of the future

The next generation of data analysis requires the next generation of tools. The most popular open-source packages for data analysis (Python’s pandas and various R packages) are designed to work with small files of basic data types, but ‘small’ and ‘basic’ do not describe the data landscape of the future. The amount of data in the world is growing exponentially, and as The Economist observes, it’s changing as it grows:

“The quality of data has changed, too. They are no longer mainly stocks of digital information – databases of names and other well-defined personal data, such as age, sex, and income. The new economy is more about analyzing rapid realtime flows of often unstructured data: the streams of photos and videos generated by users of social networks, the reams of information produced by commuters on their way to work, the flood of data from hundreds of sensors in a jet engine.”

Current-generation tools therefore face a number of difficulties in analyzing the next generation of data. The first is scale: distributed computing systems like Hadoop and Spark can provide it, but at the cost of the ease of use that makes Python and R tools attractive.

Scaling an analysis also adds costs in the form of gluing together tools that may not support the same data types or operations (e.g., Spark DataFrame to pandas DataFrame to NumPy array to scikit-learn model). Another issue for current databases is storing nonstandard data types. A database can sometimes work around unsupported types (e.g., units and currencies) by attaching metadata to a field, but the same approach is harder to apply to more complicated data like images and video. The next-generation database should therefore offer the features that are lacking in the current generation:

  • Scalability (works equally well on Small and Big Data)
  • Ease of use (no need to glue together different formats)
  • Flexibility (stores data types that may not exist yet).

Introducing JuliaDB

JuliaDB aims to be the analytics database of the future. It is implemented entirely in Julia, a high-performance language for technical computing designed around modern technologies such as just-in-time compilation, type inference, and parallelism.
