Links
https://substrait.io/
https://github.com/substrait-io/substrait
Concept
Why not use SQL?
POSIX SQL is a well known language for describing queries against relational data. It is designed to be simple and allow reading and writing by humans. Substrait is not intended as a replacement for SQL and works alongside SQL to provide capabilities that SQL lacks. SQL is not a great fit for systems that actually satisfy the query because it does not provide sufficient detail and is not represented in a format that is easy for processing. Because of this, most modern systems will first translate the SQL query into a query plan, sometimes called the execution plan. There can be multiple levels of a query plan (e.g. physical and logical), a query plan may be split up and distributed across multiple systems, and a query plan often undergoes simplifying or optimizing transformations. The SQL standard does not define the format of the query or execution plan and there is no open format that is supported by a broad set of systems. Substrait was created to provide a standard and open format for these query plans.
Why the name Substrait?
A strait is a narrow connector of water between two other pieces of water. In analytics, data is often thought of as water. Substrait is focused on instructions related to the data. In other words, what defines or supports the movement of water between one or more larger systems. Thus, the underlayment for the strait connecting different pools of water => sub-strait.