The Google Sheets QUERY function uses the Google Visualization API Query Language which offers a syntax similar to Structured Query Language (SQL) and if you have an understanding of SQL it offers a intuitive and familiar method of working with data stored in Google Sheets.
In this post, after describing the sample data, the QUERY function syntax is then explained as is why naming your data is a good idea and will make things easier for you in the future. The post then moves on to examples which demonstrate the various clauses such as SELECT, WHERE, ORDER BY and aggregation functions before wrapping up with where you can find out more.
In this post I have used the English Premier League table as it was in January 2022, you can find this on GitHub.
The QUERY function syntax
QUERY(data, query, [headers])
- data – The range of cells to perform the query on. This can be expressed as a range such as A1:J21 or as a name. The examples in this post will use a name and how to name your data is explained in the next section.
- query – The query to be performed, enclosed in quotation marks and written in Google Visualization API Query Language.
- headers – this is an optional argument and is used to specify the number of header rows at the top of data.
An example would be:
=QUERY(LeagueTable, "SELECT *", 1)
- LeagueTable – The name of the name of the data that the query will be run against.
- “SELECT *” – An example of the Google Visualization API Query Language. SELECT * selects everything in effect making a copy of data.
- 1 – Informs the QUERY function that the header is one row.
Naming your data
The first step is to provide a name for the source data. Whilst it is not mandatory to do this, I recommend it because it shows your intent especially
if when you have to revisit and change the function several days/weeks/months later.
To provide a name highlight the source data as shown below.
Next from the menu bar select Data and then Named ranges
Enter a name, I have used LeagueTable then select Done.
Selecting * or Making a copy
This first example will create a copy of the existing source data.
Enter the following in a cell a few line below the existing data.
=QUERY(LeagueTable, "SELECT *", 1)
The output of this query is shown below:
Select a subset of columns
To select specific columns, use the column name.
=QUERY(LeagueTable, "SELECT A,B,J", 1)
Using this example selects the league postion, team name and their points.
Selecting information which meets a criteria
Building on the last query, if you wanted to restrict the results to those teams which have 38 or more points you can add a WHERE clause. This clause is used to return values that match a specified condition which in this example is equal or greater than 38 which is expressed as >= 38
=QUERY(LeagueTable, "SELECT A,B,J WHERE J >=38", 1)
If the column you want to restrict on contains text then the criteria should be enclosed in quotes. If you wanted to just return information about the Watford team, the query would look like this:
=QUERY(LeagueTable, "SELECT A,B,J WHERE B = 'Watford'", 1)
Note the single quotes around Watford.
Ordering your data
The ORDER BY clause is used sort your data by the column(s) specified. The example below shows the teams which have 38 or more points with the result ordered by the team name.
=QUERY(LeagueTable, "SELECT A,B,J WHERE J >=38 ORDER BY B", 1)
There are a number of useful built in aggregation functions They are straightforward to use if applied to the whole source data. Shown below is how they are used against the example data:
=QUERY(LeagueTable, "SELECT SUM(J), AVG(J), MAX(J), MIN(J), COUNT(J)", 1)
If you want to use aggregations within subsets or groups of data you will need to use the group by clause.
Where to go to find out more?
I am a fan of Seth Godin’s corpus of work and it was this post which first made me aware of the QUERY function.
This post by Ben Collins for providing an overview of the capabilities of the QUERY function.
The official Google Sheets QUERY function documentation.