Search engines are databases that specialize in retrieving information from a data corpus. Compared to traditional databases like PostgreSQL, search engines let you work with text and other unstructured data more efficiently. Projects like Xapian and Lucene can efficiently index and query large numbers of documents. Solr and Elasticsearch add clustering and distributed query execution to scale out search.
The most obvious gap between traditional databases and search engines is the query language. Whereas relational databases can typically be queried with SQL, search engines usually implement a custom API.
At CrateDB, we don’t think you should have to give up SQL just because you’re using search engine features. That’s why we created a fully functional SQL interface on top of Elasticsearch and Lucene. You get the benefits of a traditional database as well as the features of a distributed search engine.
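To make the query-language gap concrete, here is a minimal sketch that places an Elasticsearch-style JSON query next to a SQL statement expressing the same full-text search. The table name `articles` and the columns `title` and `body` are hypothetical, and the SQL assumes a full-text index on `body` and a `MATCH` predicate with a `_score` relevance column. The snippet only builds the two queries; it does not connect to any server.

```python
import json

# Hypothetical names, used only for illustration.
TABLE = "articles"
COLUMN = "body"
TERM = "distributed search"

# Search-engine style: the query is a JSON document sent to a custom API.
dsl_query = {"query": {"match": {COLUMN: TERM}}}

# SQL style: the same full-text search as a parameterized statement.
# Assumes a full-text index on the column and a MATCH predicate.
sql_query = (
    f"SELECT title, _score FROM {TABLE} "
    f"WHERE MATCH({COLUMN}, ?) "
    "ORDER BY _score DESC LIMIT 10"
)
sql_params = [TERM]

print(json.dumps(dsl_query, indent=2))
print(sql_query, sql_params)
```

The SQL form can be combined with ordinary filters, joins, and aggregations in the same statement, which is the main appeal of keeping SQL as the interface.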
Do you want to store huge amounts of data and search it in real time? Do you have unstructured and structured data? Do you want to run distributed joins? Do you want to add nodes and scale your cluster horizontally? Do you want to leverage the power of SQL? If so, CrateDB is a great match.
In this talk, I will introduce CrateDB and its architecture, and show what people have built with it.