Hive is an open-source Data Warehouse system built atop Hadoop. It provides a SQL-like interface for querying and analyzing large datasets stored in distributed file systems. See also Big Data SQL Spark MapReduce