Today I Learned

PostgreSQL count estimate

| 1 min read

Counting records in a database in tables that store millions of entries using the COUNT function can be time-consuming and resource demanding, as we can see in the attached analysis.

(
  sql = "EXPLAIN ANALYZE SELECT COUNT(*) FROM locations;"
  
  Ecto.Adapters.SQL.query!(Repo, sql)
)
Aggregate
(cost=323.03..323.04 rows=1 width=8)
(actual time=0.723..0.724 rows=1 loops=1)
  -> Index Only Scan using locations_pkey on locations 
(cost=0.28..319.21 rows=1529 width=0)
(actual time=0.061..0.574 rows=1536 loops=1)

Heap Fetches: 75
Planning Time: 0.128 ms
Execution Time: 0.756 ms

There is a better solution, as long as we do not need precise calculations, PostgreSQL allows us to estimate the number of records which is more efficient and faster.

(
    sql = """
    EXPLAIN ANALYZE
  SELECT reltuples::bigint
  FROM pg_catalog.pg_class
  WHERE relname = 'locations';
  """
  
  Ecto.Adapters.SQL.query!(Repo, sql)
)
Index Scan using pg_class_relname_nsp_index on pg_class 
(cost=0.28..8.30 rows=1 width=8) 
(actual time=0.120..0.121 rows=1 loops=1)

Index Cond: (relname = 'locations'::name)
Planning Time: 0.250 ms
Execution Time: 0.140 ms