Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ML reactivity when there are many files #15221

Merged
merged 1 commit into from Dec 20, 2022
Merged

Conversation

petersg83
Copy link
Contributor

@petersg83 petersg83 commented Dec 20, 2022

What does it do?

Add indexes for sortable columns of files

Why is it needed?

When files are retrieved, they are sorted by a field like created_at. When there are many files, like 300k, the API response becomes slower (~1 to 2 sec). Adding an index improve the speed of the sort operation.

How to test it?

  1. Upload a file to the ML
  2. Go to your database, in the files table
  3. Copy the file row (without the ID) and paste it 200k times (it's a bit long to do but you can copy several rows at the same time. You can also do an SQL loop but not on SQLite)
  4. Go to the ML and check the request time before and after the fix.

Related issue(s)/PR(s)

Fix #14954

@petersg83 petersg83 added source: core:upload Source is core/upload package pr: enhancement This PR adds or updates some part of the codebase or features labels Dec 20, 2022
@petersg83 petersg83 added this to the 4.5.5 milestone Dec 20, 2022
@codecov
Copy link

codecov bot commented Dec 20, 2022

Codecov Report

Base: 60.34% // Head: 60.34% // No change to project coverage 👍

Coverage data is based on head (daf3092) compared to base (95ec21c).
Patch has no changes to coverable lines.

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #15221   +/-   ##
=======================================
  Coverage   60.34%   60.34%           
=======================================
  Files        1367     1367           
  Lines       33268    33268           
  Branches     6353     6353           
=======================================
  Hits        20076    20076           
  Misses      11347    11347           
  Partials     1845     1845           
Flag Coverage Δ
back 50.37% <ø> (ø)
front 64.80% <ø> (ø)
unit_back 50.37% <ø> (ø)
unit_front 64.80% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Comment on lines +111 to +115
{
name: `upload_files_created_at_index`,
columns: ['created_at'],
type: null,
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read a conversation somewhere about indexes in strapi.
If I recall correctly, this is not ready for the public yet, but I imagine you can still use this syntax in any of your content-types right?

What work would be needed to make it public? Specify the list of fields you want to index to automatically generate a index name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly :) and to convert the field name to the database column name (createdAt => updatedAt). Also some thinking to have about columns that don't appear in the user schema (like createdAt and updatedAt)

Copy link
Contributor

@Marc-Roig Marc-Roig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, using sqlite I see a very little increment in the disk when the indexes are added indeed.

Seeing this PR I am thinking of preparing some performance tests in the future, to spot which parts are slower and to decide when to use indexes.

@petersg83 petersg83 merged commit 312a555 into main Dec 20, 2022
@petersg83 petersg83 deleted the fix/faster-ML branch December 20, 2022 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr: enhancement This PR adds or updates some part of the codebase or features source: core:upload Source is core/upload package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Media Library give error when we work with 300k images
3 participants