Skip to content

Commit f7a09e3

Browse files
committed
Enhance SQL performance guidance in migration script
- Added detailed recommendations for using primary and foreign keys, including examples for product and sales tables. - Introduced best practices for indexing, query performance debugging, and database maintenance commands. - Included advice on lazy loading and reducing the number of queries to optimize SQLPage applications.
1 parent aab67ce commit f7a09e3

File tree

1 file changed

+149
-5
lines changed

1 file changed

+149
-5
lines changed

examples/official-site/sqlpage/migrations/69_blog_performance_guide.sql

Lines changed: 149 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -33,15 +33,45 @@ Your database schema should be [normalized](https://en.wikipedia.org/wiki/Databa
3333
one piece of information should be stored in only one place in the database.
3434
This is a good practice that will not only make your queries faster,
3535
but also make it impossible to store incoherent data.
36+
You should use meaningful natural [primary keys](https://en.wikipedia.org/wiki/Primary_key) for your tables
37+
and resort to surrogate keys (such as auto-incremented integer ids) only when the data is not naturally keyed.
38+
Relationships between tables should be explicitly represented by [foreign keys](https://en.wikipedia.org/wiki/Foreign_key).
3639
37-
For instance, if you are modelling sales that happen in stores, the sales table should
38-
contain a foreign key to another table named stores.
39-
It should not contain the full store name.
40+
```sql
41+
-- Products table, naturally keyed by catalog_number
42+
CREATE TABLE product (
43+
catalog_number VARCHAR(20) PRIMARY KEY,
44+
name TEXT NOT NULL,
45+
price DECIMAL(10,2) NOT NULL
46+
);
47+
48+
-- Sales table: natural key = (sale_date, store_id, transaction_number)
49+
-- composite primary key used since no single natural attribute alone uniquely identifies a sale
50+
CREATE TABLE sale (
51+
sale_date DATE NOT NULL,
52+
store_id VARCHAR(10) NOT NULL,
53+
transaction_number INT NOT NULL,
54+
product_catalog_number VARCHAR(20) NOT NULL,
55+
quantity INT NOT NULL CHECK (quantity > 0),
56+
PRIMARY KEY (sale_date, store_id, transaction_number),
57+
FOREIGN KEY (product_catalog_number) REFERENCES product(catalog_number),
58+
FOREIGN KEY (store_id) REFERENCES store(store_id)
59+
);
60+
```
61+
62+
Always use foreign keys instead of trying to store redundant data such as store names in the sales table.
4063
4164
This way, when you need to display the list of stores in your application, you don''t have to
4265
run a slow `select distinct store from sales`, that would have to go through your millions of sales
4366
(*even if you have an index on the store column*), you just query the tiny `stores` table directly.
4467
68+
You also need to use the right [data types](https://en.wikipedia.org/wiki/Data_type) for your columns,
69+
otherwise you will waste a lot of space and time converting data at query time.
70+
See [postgreSQL data types](https://www.postgresql.org/docs/current/datatype.html),
71+
[MySQL data types](https://dev.mysql.com/doc/refman/8.0/en/data-types.html),
72+
[Microsoft SQL Server data types](https://learn.microsoft.com/en-us/sql/t-sql/data-types/data-types-transact-sql?view=sql-server-ver16),
73+
[SQLite data types](https://www.sqlite.org/datatype3.html).
74+
4575
[Denormalization](https://en.wikipedia.org/wiki/Denormalization) can be introduced
4676
only after you have already normalized your data, and is often not required at all.
4777
@@ -89,8 +119,56 @@ group by store_name;
89119
90120
### Use database indices
91121
122+
When a query on a large table uses non-primary column in a `WHERE`, `GROUP BY`, `ORDER BY`, or `JOIN`,
123+
you should create an [index](https://en.wikipedia.org/wiki/Database_index) on that column.
124+
When multiple columns are used in the query, you should create a composite index on those columns.
125+
When creating a composite index, the order of the columns is important.
126+
The most frequently used columns should be first.
127+
128+
```sql
129+
create index idx_sales_store_date on sale (store_id, sale_date); -- useful for queries that filter by "store" or by "store and date"
130+
create index idx_sales_product_date on sale (product_id, sale_date);
131+
create index idx_sales_store_product_date on sale (store_id, product_id, sale_date);
132+
```
133+
134+
Indexes are updated automatically when the table is modified.
135+
They slow down the insertion and deletion of rows in the table,
136+
but speed up the retrieval of rows in queries that use the indexed columns.
137+
92138
### Query performance debugging
93139
140+
When a query is slow, you can use the `EXPLAIN` keyword to see how the database will execute the query.
141+
Just add `EXPLAIN` before the query you want to analyze.
142+
143+
On PostgreSQL, you can use a tool like [explain.dalibo.com](https://explain.dalibo.com/) to visualize the query plan.
144+
145+
What to look for:
146+
- Are indexes used? You should see references to the indices you created.
147+
- Are full table scans used? Large tables should never be scanned.
148+
- Are expensive operations used? Such as sorting, hashing, bitmap index scans, etc.
149+
- Are operations happening in the order you expected them to? Filtering large tables should come first.
150+
151+
### Vacuum your database regularly
152+
153+
On PostgreSQL, you can use the [`VACUUM`](https://www.postgresql.org/docs/current/sql-vacuum.html) command to garbage-collect and analyze a database.
154+
155+
On MySQL, you can use the [`OPTIMIZE TABLE`](https://dev.mysql.com/doc/refman/8.0/en/optimize-table.html) command to reorganize it on disk and make it faster.
156+
On Microsoft SQL Server, you can use the [`DBCC DBREINDEX`](https://learn.microsoft.com/en-us/sql/t-sql/database-console-commands/dbcc-dbreindex-transact-sql?view=sql-server-ver17) command to rebuild the indexes.
157+
On SQLite, you can use the [`VACUUM`](https://www.sqlite.org/lang_vacuum.html) command to garbage-collect and analyze the database.
158+
159+
### Use the right database engine
160+
161+
If the amount of data you are working with is very large, does not change frequently, and you need to run complex queries on it,
162+
you could use a specialized analytical database such as [ClickHouse](https://clickhouse.com/) or [DuckDB](https://duckdb.org/).
163+
Such databases can be used with SQLPage by using their [ODBC](https://en.wikipedia.org/wiki/Open_Database_Connectivity) drivers.
164+
165+
### Database-specific performance recommendations
166+
167+
- [PostgreSQL "Performance Tips"](https://www.postgresql.org/docs/current/performance-tips.html)
168+
- [MySQL optimization guide](https://dev.mysql.com/doc/refman/8.0/en/optimization.html)
169+
- [Microsoft SQL Server "Monitor and Tune for Performance"](https://learn.microsoft.com/en-us/sql/relational-databases/performance/monitor-and-tune-for-performance?view=sql-server-ver17)
170+
- [SQLite query optimizer overview](https://www.sqlite.org/optoverview.html)
171+
94172
## SQLPage-specific advice
95173
96174
The best way to make your SQLPage webapp fast is to make your queries fast.
@@ -107,17 +185,83 @@ So, if you have to execute a slow query, put it as far down in the page as possi
107185
108186
Every user-facing page in a SQLPage site has a [shell](/components?component=shell).
109187
110-
The first queries in any sql file (all the ones that come before the [])
188+
The first queries in any sql file (all the ones that come before the [shell](/components?component=shell))
189+
are executed before any data has been sent to the user''s browser.
190+
During that time, the user will see a blank screen.
191+
So, ensure your shell comes as early as possible, and does not require any heavy computation.
192+
If you can make your shell entirely static (independent of the database), do so,
193+
and it will be rendered before SQLPage even finishes acquiring a database connection.
111194
112195
#### Set variables just above their first usage
113196
197+
For the reasons explained above, you should avoid defining all variables at the top of your sql file.
198+
Instead, define them just above their first usage.
199+
114200
### Avoid recomputing the same data multiple times
115201
202+
Often, a single page will require the same pieces of data in multiple places.
203+
In this case, avoid recomputing it on every use inside the page.
204+
205+
#### Reusing a single database record
206+
207+
When that data is small, store it in a sqlpage variable as JSON and then
208+
extract the data you need using [json operations](/blog.sql?post=JSON%20in%20SQL%3A%20A%20Comprehensive%20Guide).
209+
210+
```sql
211+
set product = (
212+
select json_object(''name'', name, ''price'', price) -- in postgres, you can simply use row_to_json(product)
213+
from products where id = $product_id
214+
);
215+
216+
select ''alert'' as component, ''Product'' as title, $product->>''name'' as description;
217+
```
218+
219+
#### Reusing a large query result set
220+
221+
You may have a page that lets the user filter a large dataset by many different criteria,
222+
and then displays multiple charts and tables based on the filtered data.
223+
224+
In this case, store the filtered data in a temporary table and then reuse it in multiple places.
225+
226+
```sql
227+
drop table if exists filtered_products;
228+
create temporary table filtered_products as
229+
select * from products where
230+
($category is null or category = $category) and
231+
($manufacturer is null or manufacturer = $manufacturer);
232+
233+
select ''alert'' as component, count(*) || '' products'' as title
234+
from filtered_products;
235+
236+
select ''list'' as component;
237+
select name as title from filtered_products;
238+
```
239+
116240
### Reduce the number of queries
117241
242+
Each query you execute has an overhead of at least the time it takes to send a packet back and forth
243+
between SQLPage and the database.
244+
When it''s possible, combine multiple queries into a single one, possibly using
245+
[`UNION ALL`](https://en.wikipedia.org/wiki/Set_operations_(SQL)#UNION_operator).
246+
247+
```sql
248+
select ''big_number'' as component;
249+
250+
with stats as (
251+
select count(*) as total, avg(price) as average_price from filtered_products
252+
)
253+
select ''count'' as title, stats.total as value from stats
254+
union all
255+
select ''average price'' as title, stats.average_price as value from stats;
256+
```
257+
118258
### Lazy loading
119259
120-
Use the card and modal components to load data lazily.
260+
Use the [card](/component?component=card) and [modal](/component?component=modal) components
261+
with the `embed` attribute to load data lazily.
262+
Lazy loaded content is not sent to the user''s browser when the page initially loads,
263+
so it does not block the initial rendering of the page and provides a better experience for
264+
data that might be slow to load.
121265
122266
### Database connections
123267

0 commit comments

Comments
 (0)