Postgresql: duplicate schemas automatically
|
After this comment to a my question, im thinking if is better using 1 database with X schemas or viceversa. My situation: im developing a web-app where, when people do register, i create (actually) a database (no, its not a social network: everyone must have access to his own data and never see the data of the other user). Thats the way i used for the previus verison of my application (that is still running on mysql): throught the plesk api, for every registration i do:
Now, i'll need to do the same with posrtgresql (the project is getting mature and mysql.. dont fulfill all the needes) I need to have all the databases/schemas backups indipendent: pg_dump works perfectly in both ways, the same for the users that can be configured to access just 1 schema or 1 database. So, assuming you are more experienced potsgres users than me, what do you think is the best solution for my situation, and why? There will be performance differences using $x db instead of $x schemas? And what solution will be better to mantein in future (reliability)? Every help and suggestion is really appreciated. Edit: i almost forgot: all of my databases/schemas will allways have the same structure! Edit2: For the backups issue (using pg_dump), is maybe better using 1 db and many schemas, dumping all the schemas at once: recovering will be quite simple loading the main dump in a dev machine and then dump and restore just the schema needed: there is 1 additional step, but dumping all the schema seem faster then dumpin them one by one. p.s: sorry if i forgot some 'W' char in the text, my keyboard suffer that button ;) |
||||||||
|
|
|
A PostgreSQL "schema" is roughly the same as a MySQL "database". Having many databases on a PostgreSQL installation can get problematic; having many schemas will work with no trouble. So you definitely want to go with one database and multiple schemas within that database. |
||||||||
|
|
|
I would say, go with multiple databases AND multiple schemas :) Schemas in postgres are a lot like packages in Oracle, in case you are familiar with those. Databases are meant to differentiate between entire sets of data, while schemas are more like data entities. For instance, you could have one database for an entire application with the schemas "UserManagement", "LongTermStorage" and so on. "UserManagement" would then contain the "User" table, as well as all stored procedures, triggers, sequences etc. that are needed for the user management. Databases are entire programs, schemas are components. |
||
|
|
|
A number of schemas should be more lightweight than a number of databases, although I cannot find a reference which confirms this. But if you really want to keep things very separate (instead of refactoring the web application so that a "costomer" column is added to your tables), you may still want to use separate databases: I assert that you can more easily make restores of a particular customer's database this way -- without disturbing the other customers. |
||
|
|
Definitely, i'll go for the 1-db-many-schemas approach. This let me to dump all the database but restore just 1 very easly, in many ways:
Otherwise, googlin around iv'se seen that there is no auto-procedure to duplicate a schema (using one as a template), but many suggest this way:
I've written 2 rows in python to do that, hope thay can help someone (in-2-seconds-written-code, dont use it in production): #Take the ne shcema name from the second cmd arguments (the first is the filename) newSchema = sys.argv[1] #Temp folder for the dumps #Settings db_pass = 'db_pass'
#Rename schema with the new name #Dump it os.system(command) pgConnect.query("ALTER SCHEMA " + str(newSchema) + " RENAME TO " + schema_as_template) #Restore the previus dump to create the new schema os.system(restore) os.remove(dumpFile) pgConnect.close() |

