Use case example: Pre seeded sqlite use case #314
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,103 @@ | ||
| --- | ||
| title: "Pre-Seeding SQLite Databases" | ||
| description: "Optimizing initial sync by pre-seeding SQLite databases." | ||
|
Collaborator
case inconsistency |
||
| --- | ||
|
|
||
| # Overview | ||
|
|
||
| When syncing with large amounts of data, it can be useful to pre-seed the SQLite database with an initial snapshot of the data. This can help to reduce the initial sync time and improve the user experience. | ||
|
|
||
| To achieve this, you can run server-side processes to pre-generate and seed SQLite files. These files can then be uploaded to blob storage, such as AWS S3, Azure Blob Storage, or Google Cloud Storage, to be downloaded directly by the client applications, bypassing the initial sync process. | ||
|
|
||
| The [PowerSync Node.js SDK](/client-sdk-references/node) comes in handy in these scenarios, because you can run it on the server in Node.js applications. | ||
|
Collaborator
grammar error |
||
|
|
||
| <Card title="nodejs-react-native-sqlite-seeder" icon="github" href="https://github.com/powersync-community/nodejs-react-native-sqlite-seeder"> | ||
|
Collaborator
This card appears without context, should it be accompanied with some description text? |
||
| Example repo using a self-hosted PowerSync instance connected to a PostgreSQL database, the PowerSync Node.js SDK, the React Native SDK, and AWS S3. | ||
| </Card> | ||
|
|
||
| # Main Concepts | ||
|
|
||
| ## Generate a client-specific JWT token | ||
| If you want to populate the SQLite database dynamically, you can generate a JWT token that is scoped to a specific client device. In the server-side application, you would typically query the source database directly and fetch the IDs that the parameter queries in your sync rules require. | ||
|
Collaborator
typo
Collaborator
is the token scoped to a device or a user? |
||
|
|
||
| For example, if you have sync rules that look like this: | ||
|
|
||
| ```yaml | ||
| sync_rules: | ||
|   content: | | ||
|     bucket_definitions: | ||
|       store_products: | ||
|         parameters: SELECT id as store_id FROM stores WHERE id = request.user_id() | ||
|         data: | ||
|           - SELECT * FROM stores WHERE id = bucket.store_id | ||
|           - SELECT * FROM products WHERE store_id = bucket.store_id | ||
| ``` | ||
| You would want to query the source database for the `store_id` and use it to generate a JWT token that is scoped to a specific client device. | ||
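A minimal sketch of the token step described above, assuming the PowerSync instance is configured to accept HS256 tokens signed with a shared secret. The URL, key ID, secret, and the `createSeedToken` helper are hypothetical placeholders, not values from this PR. The store ID is placed in the `sub` claim so that `request.user_id()` in the sync rules resolves to it.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical values: replace with your own instance URL, key ID and secret.
const POWERSYNC_URL = "https://powersync.example.com";
const KEY_ID = "seeder-key";
const SHARED_SECRET = "replace-with-a-real-secret";

// Encodes a UTF-8 string as unpadded base64url, as JWT segments require.
const base64url = (input: string): string =>
  Buffer.from(input, "utf8").toString("base64url");

// Signs an HS256 JWT whose `sub` claim is the store ID.
function createSeedToken(storeId: string, ttlSeconds = 300): string {
  const now = Math.floor(Date.now() / 1000);
  const header = { alg: "HS256", typ: "JWT", kid: KEY_ID };
  const payload = { sub: storeId, aud: POWERSYNC_URL, iat: now, exp: now + ttlSeconds };
  const signingInput =
    `${base64url(JSON.stringify(header))}.${base64url(JSON.stringify(payload))}`;
  const signature = createHmac("sha256", SHARED_SECRET)
    .update(signingInput)
    .digest()
    .toString("base64url");
  return `${signingInput}.${signature}`;
}
```

The server-side connector would return this token from its credentials callback. A short lifetime is usually enough here, because the token only needs to stay valid while the seeding run is in progress.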
|
|
||
| ## Pre-seeding script | ||
|
|
||
| A simple script to pre-seed the SQLite database would look like this: | ||
|
|
||
| ```typescript | ||
| async function prepareDatabase(storeId: string) { | ||
|   const connector = new Connector(); | ||
|   await powersync.connect(connector); | ||
|   await powersync.waitForFirstSync(); | ||
|   // Remove the seeder's own client_id so that each downloading client can set its own. | ||
|   await powersync.execute("DELETE FROM ps_kv WHERE key = ?", ["client_id"]); | ||
|   const backupPath = `/path/to/sqlite/${storeId}.sqlite`; | ||
|
|
||
|   // The target filename must be passed as a quoted SQL string literal. | ||
|   await powersync.execute(`VACUUM INTO '${backupPath}'`); | ||
|   await uploadFile(storeId, `${storeId}.sqlite`, backupPath); | ||
|
|
||
|   // Disconnect from the PowerSync instance before closing the database. | ||
|   await powersync.disconnect(); | ||
|   await powersync.close(); | ||
| } | ||
| ``` | ||
| <Note> | ||
| Some critical points to note: | ||
| - You will need to wait for the first sync to complete before deleting the `client_id` key and vacuuming the database. This makes sure all of the data is synced to the database before we proceed. | ||
| - The `client_id` key is used to identify the client device and is typically set when the client connects to the PowerSync instance. So when pre-seeding the database, we need to delete the `client_id` key to avoid conflicts when the client connects to the PowerSync instance. | ||
| - Use the [`VACUUM INTO`](https://sqlite.org/lang_vacuum.html) command to create a clean, portable SQLite database file. This reduces the size of the database file and gives the client an optimized file to download. | ||
| - In this example the upload function is using AWS S3, but you can use any blob storage provider that you prefer. | ||
| </Note> | ||
|
|
||
|
|
||
| ### Scheduling and Cleaning Up | ||
|
|
||
| To enhance the process you can consider doing the following: | ||
| - To keep the pre-seeded SQLite databases fresh, schedule a CRON job for periodic regeneration, ensuring that new clients always download the latest snapshot of the initial sync data. | ||
|
Check warning on line 73 in usage/use-case-examples/pre-seeded-sqlite.mdx
|
||
|
Collaborator
multiple typos |
||
| - After each run, perform some environment cleanup to avoid disk bloat. This can be done by deleting the pre-seeded SQLite database files after they have been uploaded to the blob storage. | ||
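The cleanup step could be sketched as follows. `cleanupSeedFile` is a hypothetical helper, not part of the example repo. It removes the generated file plus any `-wal` / `-shm` sidecar files that happen to sit next to it. The usage lines simulate a seed file in a temporary directory.

```typescript
import { existsSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Deletes a generated seed file and any SQLite sidecar files next to it.
// Returns the paths that were actually removed.
function cleanupSeedFile(backupPath: string): string[] {
  const candidates = [backupPath, `${backupPath}-wal`, `${backupPath}-shm`];
  const removed: string[] = [];
  for (const file of candidates) {
    if (existsSync(file)) {
      rmSync(file, { force: true });
      removed.push(file);
    }
  }
  return removed;
}

// Usage: simulate a generated seed file, then clean it up after the "upload".
const workDir = mkdtempSync(join(tmpdir(), "seed-"));
const seedPath = join(workDir, "store-123.sqlite");
writeFileSync(seedPath, "");
const removedFiles = cleanupSeedFile(seedPath);
```

Calling this only after the upload has succeeded avoids losing a snapshot that never reached blob storage.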
|
|
||
| ## Client Side Usage | ||
|
|
||
| When the client application boots, before connecting to the PowerSync instance, check if the pre-seeded SQLite database exists in the blob storage. If it does, download it and use it when initializing the `PowerSyncDatabase` class. | ||
|
Collaborator
typo |
||
|
|
||
| <Warning> | ||
| When the client downloads the pre-seeded SQLite database, make sure it is stored in a permanent location on the device rather than a cache or temporary directory. This ensures the database is not deleted when the app crashes or is restarted. Uninstalling the app will still remove it. | ||
|
Collaborator
if you uninstall from iOS all app data will be deleted, this is not possible. I think this only caters for crashes/restarts, not uninstall |
||
| Depending on which PowerSync SDK you are using, you may need to use framework specific methods to store the file in a permanent location on the device. For example, in React Native + Expo you can use the [`expo-file-system`](https://docs.expo.dev/versions/latest/sdk/filesystem/) module to store the file in a permanent location on the device. | ||
| </Warning> | ||
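The boot-time check described above could be sketched like this. `SeedStorage` is a hypothetical abstraction over whatever file and network APIs the platform offers (for example `expo-file-system` on React Native). It is not part of any PowerSync SDK.

```typescript
// Hypothetical abstraction over the platform's file and network APIs.
interface SeedStorage {
  localExists(path: string): Promise<boolean>;
  remoteExists(url: string): Promise<boolean>;
  download(url: string, path: string): Promise<void>;
}

type SeedResult = "already-present" | "downloaded" | "not-available";

// Makes sure a seed database is present locally before PowerSync is initialized.
// Falls back to "not-available" so the caller can run a regular initial sync.
async function ensureSeedDatabase(
  storage: SeedStorage,
  remoteUrl: string,
  localPath: string
): Promise<SeedResult> {
  if (await storage.localExists(localPath)) return "already-present";
  if (!(await storage.remoteExists(remoteUrl))) return "not-available";
  await storage.download(remoteUrl, localPath);
  return "downloaded";
}
```

Only the `"downloaded"` case needs a new `client_id` before connecting. In the other two cases the database either already has one or will be created from scratch.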
|
|
||
| Once the database is downloaded, insert a new `client_id` key into the `ps_kv` table and connect to the PowerSync instance. | ||
|
Check warning on line 85 in usage/use-case-examples/pre-seeded-sqlite.mdx
|
||
|
Collaborator
would be good to use backtick formatting for client_id and ps_kv |
||
|
|
||
| ```typescript | ||
| async function configureDatabase() { | ||
|   // Call init() first. This initializes the database without connecting to the PowerSync instance. | ||
|   await powersync.init(); | ||
|   // "1234567890" is a placeholder; use a value that is unique to this client. | ||
|   await powersync.execute("INSERT INTO ps_kv (key, value) VALUES (?, ?)", ["client_id", "1234567890"]); | ||
|   await powersync.connect(connector); | ||
| } | ||
| ``` | ||
|
|
||
| <Tip> | ||
| It's important that you insert a new `client_id` key into the `ps_kv` table to avoid conflicts when the client connects to the PowerSync instance. | ||
|
Check warning on line 97 in usage/use-case-examples/pre-seeded-sqlite.mdx
|
||
| </Tip> | ||
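One way to avoid reusing the same `client_id` across devices is to generate a random value per install. The sketch below assumes a UUID is an acceptable value, which this PR does not confirm, and uses Node's `randomUUID`. On React Native you would swap in a UUID source available on that platform. `buildClientIdInsert` is a hypothetical helper.

```typescript
import { randomUUID } from "node:crypto";

// Builds the INSERT statement and parameters for a fresh client_id.
// Assumption: a random UUID is a suitable client_id value.
function buildClientIdInsert(): { sql: string; params: [string, string] } {
  return {
    sql: "INSERT INTO ps_kv (key, value) VALUES (?, ?)",
    params: ["client_id", randomUUID()],
  };
}
```

The returned `sql` and `params` can be passed straight to `powersync.execute(sql, params)` in place of the hard-coded value.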
|
|
||
| At this point the client can connect to the PowerSync instance and continue syncing from the state captured in the snapshot, rather than running the full initial sync. | ||
|
Collaborator
suggest rephrasing to make this more clear. what does "as normal" mean? it means resume syncing from where the snapshot was created and where the service operation history is at, right? |
||
|
|
||
|
|
||
|
|
||
the use case examples are roughly ordered alphabetically, with data pipelines at the bottom since it's enterprise only. could you put this between postgis and prioritized sync?