AppUnite LogoBlog
< Go to homepage
Contact
Engineering

Łukasz Pauszek

Share:
Share on FacebookShare on TwitterShare on LinkedIn

Ecto.Query.preload/3 vs. Ecto.Repo.preload/3

| 2 min read

When I submitted a PR, a friend asked:
Why are you using “Repo.preload/3” and not “Query.preload/3?
And then the second one pitched in as he saw preload as the first function call in one of my longer queries: Why are you preloading for all of the records?

It got me thinking about which solution is better. Let’s find out what the difference is between Ecto.Query.preload/3 and Ecto.Repo.preload/3.

## TL;DR

For those in a rush here you can get to know the essence of my findings.

When using either of preload/3 for it's default job to load associations for every record, like:

Demo.Blog.Post |> Ecto.Query.preload(:comments) |> Demo.Repo.all()
# or
Demo.Blog.Post |> Demo.Repo.all() |> Demo.Repo.preload(:comments)

It doesn’t matter which one we use. These generate the same queries. Looking at time efficiency, they are comparable.

Nevertheless, there are areas in which one is better than the other. Like, Ecto.Query.preload/3 has the ability to limit the number of queries by using join and Ecto.Repo.preload/3 can limit the number of records fetched by not loading again already present associations.

How did I get to these conclusions? Let’s read ahead and find out together.

hexdocs.pm

My first point of interest was documentation pages. I wanted to verify whether official documentation will provide some answers right away.

Ecto.Query.preload/3 \ Preloads the associations into the result set.

Ecto.Repo.preload/3 \ Preloads all associations on the given struct or structs. This is similar to Ecto.Query.preload/3 \ except it allows you to preload structs after they have been fetched from the database. \ In case the association was already loaded, preload won't attempt to reload it.

After reading this, I felt a bit disappointed. These short descriptions provided only part of the information I was after. They covered what I have already known - you can preload associations in two different places:

  1. while fetching records,
  2. when they are already present.

I wasn’t satisfied with this. I wanted to know more, so I ditched documentation for now and went ahead to step number two.

The code

As described in the docs, both preload/3 work the same way fetching associations. They generate separate query (or queries, depending on how many associations we want to fetch) to get all needed records.

defmodule Demo.Blog.Posts do
  use Ecto.Schema
  import Ecto.Changeset

  schema "posts" do
    field :content, :string
    has_many(:comments, Demo.Blog.Comments, foreign_key: :post_id)

    timestamps()
  end
end

defmodule Demo.Blog.Comments do
  use Ecto.Schema
  import Ecto.Changeset

  schema "comments" do
    field :content, :string
    belongs_to(:post, Demo.Blog.Posts)

    timestamps()
  end
end

So I have created simple relations, put some records in DB and started querying to get real answers for questions that scratched at the back of my head.

Are queries generated by Query.preload/3 and Repo.preload/3 different?

# Demo.Blog.Posts |> preload(:comments) |> Demo.Repo.all()
SELECT p0."id", p0."content", p0."inserted_at", p0."updated_at" FROM "posts" AS p0 []
SELECT c0."id", c0."content", c0."post_id", c0."inserted_at", c0."updated_at", c0."post_id" FROM "comments" AS c0 WHERE (c0."post_id" = ANY($1)) ORDER BY c0."post_id" [[2, 1]]
# Demo.Blog.Posts |> Demo.Repo.all() |> Demo.Repo.preload(:comments)
SELECT p0."id", p0."content", p0."inserted_at", p0."updated_at" FROM "posts" AS p0 []
SELECT c0."id", c0."content", c0."post_id", c0."inserted_at", c0."updated_at", c0."post_id" FROM "comments" AS c0 WHERE (c0."post_id" = ANY($1)) ORDER BY c0."post_id" [[2, 1]]

Aaaand they look the same, not what I had expected, but good to know. Next question…

Does Query.preload/3 preloads for all associations?

# Demo.Blog.Posts |> where(content: "Post text etc") |> preload(:comments) |> Demo.Repo.all()
[debug] QUERY OK source="posts" db=4.0ms queue=6.2ms idle=1766.6ms
SELECT p0."id", p0."content", p0."inserted_at", p0."updated_at" FROM "posts" AS p0 WHERE (p0."content" = 'Post text etc') []
[debug] QUERY OK source="comments" db=1.1ms queue=2.0ms idle=1777.1ms
SELECT c0."id", c0."content", c0."post_id", c0."inserted_at", c0."updated_at", c0."post_id" FROM "comments" AS c0 WHERE (c0."post_id" = $1) ORDER BY c0."post_id" [1]

Yes! Finally, some answers I was looking for! When we preload from a query, we only do so for the records that were left after filtering. This is further confirmed by the docs (but in my opinion, it’s not said explicitly enough)

… fetch (…) from the database and then do a separate query returning all (…) associated …

Can I use queries when preloading data?

Most queries that require preload use what I like to call a default form. Which is a passing list of atoms such as:

(...)
|> preload([:comments, :author])
(...)

or

(...)
|> preload([:comments, author: [:friends]])
(...)

Surprisingly you can also use queries inside preload statements to limit, order or filter preloaded records. This will still produce two queries, but you gain more control over associations and what is important here, both Ecto.Query and Ecto.Repo functions support this. Some of use cases, for example, Post -> Comment associations include:

  • limiting number of comments fetched for a single post
  comments_query = Comments |> limit(5)
  Posts |> Ecto.Query.preload(comments_query) |> Repo.all()
  • ordering comments e.g. by date of creation
   comments_query = Comments |> order_by(:inserted_at)
  Posts |> Repo.all() |> Repo.preload(comments_query)

What we do need to watch out for is that such queries are working on a whole subset of data we are fetching, that means that limit/2 in the example above will fetch and associate only 5 comments for all of the posts, not 5 comments for each post.

Is there a way to limit number of queries when preloading?

Short answer: yes, but only for Ecto.Query.preload/3, we can use join statements to select and filter our associations \ in the same query. The example below allows for fetching posts and associated comments with single query.

   # Demo.Repo.all(
   # from p in Demo.Blog.Posts,
   # join: c in assoc(p, :comments),
   # where: c.inserted_at > p.updated_at,
   # preload: [comments: c]
   # )
[debug] QUERY OK source="posts" db=13.6ms queue=2.0ms idle=578.7ms
SELECT p0."id", p0."content", p0."inserted_at", p0."updated_at", c1."id", c1."content", c1."post_id", c1."inserted_at", c1."updated_at" FROM "posts" AS p0 INNER JOIN "comments" AS c1 ON c1."post_id" = p0."id" WHERE (c1."inserted_at" > p0."updated_at") []````

So what is better? Right now, I can only answer: it depends. As in almost everything we do, it’s a case by case question we need to answer to achieve the best results.

Related articles

Engineering

How to introduce yourself to a new team

Jacek Marchwicki

When you join a new team, you usually want to show that you’re a valuable person to work with. Here you'll find some personal opinion based on our developer's experience.

Business & Strategy

What’s the best solution for your app: “No code” or traditional development?

Amelia Kaczmarek

Should you give “no code” a try or rather stick to traditional development methods in terms of developing your new app or a product?

Product Managment

5 tips for software engineers transitioning to the team leader role

Piotr Mądry

The demand for software is growing continuously. Newly created teams need strong leadership and leaders are often chosen among software engineers. What to expect and how to prepare?

This website stores cookies on your computer. The data is used to collect information about how you interact with our website and allow us to remember you. We use this information to improve and customize your browsing experience and for analytics and metrics about our visitors both on this website and other media. Cookie Policy Privacy Notice

We'd love to build something amazing together!

Make the first step for a great partnership! Share your idea with us and check what we can do for you and your company.

Start a project
ClutchFinancial TimesForbes