|
Mistakes SQL most frequent |
|
|
|
|
Thursday, 05 June 2008 |
1. The names of objects SQL I have given explicit names to my tables and columns, and I am encountering problems ... Why? SQL is a language based on a standard. The names of SQL objects have a specific construction standards. In particular, it may not use the 26 letters of the alphabet, ten digits and the character "underscore" (underscore). Any other character is forbidden to appoint an SQL object (table, column view, user, coercion ...). Some RDBMS allow anything like characters in the name of an object ... alas! The maximum number of characters is 128, but some have limits RDBMS smaller as Oracle. It therefore banned the "white", diacriticals (accents, cedilla, etc.).. In principle, insensitive (ie uppercase and lowercase) is not important in the appointment of objects. However, some RDBMS are sensitive to the name of SQL objects under certain conditions. That is particularly true of SQL Server, if one parameter to the installation CASE SENSITIVE. It is very dangerous to give names of objects with white or accents. Why? Because some program modules do not work on the same codes of pages. At a time when databases are increasingly integrated into the Internet, multiple interfaces treatment makes it difficult or impossible interaction of the code, if the names of SQL objects have not been defined properly. In addition, if the name contains illegal characters (what some RDBMS allow as Access or SQL Server) then we must ruser using specific expressions, and treatments arising take more time than if the names had been properly defined. The standard in this area Although write ... Here are some rules to write your code SQL ... First, always write your SQL code capitalized. Thus the names of objects will be written for instance: CREATE TABLE CLIENT (CLI_NUM INTEGER NOT NULL PRIMARY KEY, CLI_NOM VARCHAR (32)) SELECT * FROM CLIENT WHERE CLI_NOM LIKE '% OF T' Open any book dealing with computer SQL. You'll always SQL code capitalized. It is a habit acquired for a very long time. She became a genuine de facto standard. Why? Because when you mix of client code (Cobol, C, VB, Delphi, PHP, C #, Java ...) and SQL code, it is difficult to find if we can not easily distinguish l 'one another. Thus the habit was taken to write the code "client" and a tiny code "server" capitalized. This allows a distinction effective and immediate elements of code running on the seveur and those running locally. Example: With dmSys.QAdoGenericUpdate do Begin If active then close; SQL.clear; SQL.add ( 'UPDATE TS_PARAMETRES_BASE_PRB SET PRB_VALEUR =' + QuotedStr (ParaVal) + 'WHERE PRB_NOM =' + QuotedStr (ParaName)); execSQL; end; Then indentez your SQL code! Indeed SQL can become difficult to read if not properly presented. Here are the same for comparison written request malproprement and its twin written with indentation ... Which one would know you read and understand more quickly? SELECT DISTINCT VILLE_ETP FROM T_ENTREPOT WHERE RAYON_RYN IN (SELECT RAYON_RYN FROM T_ENTREPOT WHERE RAYON_RYN NOT IN (SELECT RAYON_RYN FROM T_ENTREPOT WHERE RAYON_RYN NOT IN (SELECT RAYON_RYN FROM T_RAYON)) ) GROUP BY VILLE_ETP HAVING COUNT (*) = (SELECT COUNT(DISTINCT RAYON_RYN) FROM T_RAYON) | SELECT DISTINCT VILLE_ETP FROM T_ENTREPOT WHERE RAYON_RYN IN (SELECT RAYON_RYN FROM T_ENTREPOT WHERE RAYON_RYN NOT IN (SELECT RAYON_RYN FROM T_ENTREPOT WHERE RAYON_RYN NOT IN (SELEC T RAYON_RYN FROM T_RAYON))) GROUP BY VILLE_ETP HAVING COUNT (*) = (SELECT COUNT(DISTINCT RAYON_RYN) FROM T_RAYON) | The basic rule on the matter is: 1. all applications must be a homogeneous block lines indented the same origin 2. each new clause of a complaint must begin on line 3. when a clause is composed of several distinct elements, write them on different lines 4. develop the FROM clause in the form of a tree with a line for the table and a line to join the clause 5. not to exceed 70 to 80 characters per line Example SELECT ITM.ITM_ID, PRG.PRG_ID, PRG_TEXTE, TPG_LIBELLE, CASE WHEN TPG.TPG_ID = 1 THEN '' + COALESCE (CAST (PRG_ORDRE AS VARCHAR (32 )),'') ELSE'' AS END TEXT PRG_ORDRE FROM T_ITEM_ITM ITM LEFT OUTER JOIN T_PARAGRAPHE_PRG PRG ON ITM.ITM_ID = PRG.ITM_ID LEFT OUTER JOIN TR_TYPE_PARAGRAPHE_TPG TPG ON PRG.TPG_ID = TPG.TPG_ID LEFT OUTER JOIN ITS TR_STYLE_ITEM_STI ON ITM.STI_ID = STI.STI_ID BY ORDER ITM_BG, PRG_ORDRE Here join the tree is: Table T_ITEM_ITM => root of the tree Table T_PARAGRAPHE_PRG => branch from table root T_ITEM_ITM Table TR_TYPE_PARAGRAPHE_TPG => branch since T_PARAGRAPHE_PRG table (hence double indentation) Table TR_STYLE_ITEM_STI => branch from table root T_ITEM_ITM If the heart tells you, do not hesitate to adopt a standard coding objects of your base. In this case, read this: Standardization names of database objects The names of objects are sensitive? The standard SQL said that in principle, the names of objects SQL (table name, sight, column ...) are not sensitive. Thus T_Client and T_CLIENT are a single object. But this rule has an exception. Indeed, when using an identifier SQL as a noun, the name becomes sensitive. For example, if I want to use the keyword TYPE column name as a table, I must surround the quotation marks because that word is reserved. Therefore, this column name becomes sensitive. Example: CREATE MA_TABLE ("TYPE" INTEGER, "Type" FLOAT, "TYpe" CHAR(1)) It is an order of creation SQL table quite valid! ATTENTION: as the names of objects are stored in tables of the database known as "table scheme" or "dictionary" or table "system", some RDBMS in certain configurations are sensitive to the names of objects. One example of SQL Server, which at its facility can be configured to be sensitive ...
2. Terminology SQL Misconception: "field" and "registration" do not exist in the databases! No more than "rights" ... CHAMP: this concept does not exist in databases as a field is a visual element (the operative field surgeon, pilot's visual field, fields in an entry interface ...) In database is called "column" because the tables are viewed in tabular form in which there are rows and columns ...
REGISTRATION: This notion does not exist in databases as a registration read a file line by line, while (except for the basics-type file) all tables in the same basis are stored in a single and same file by granules storage unit called "pages". It speaks of a line which is a virtual concept. RIGHTS: rights do not exist in SQL. Indeed rights assume a mode of access to a file. But databases and SQL manage various modes that are not necessarily something to do with the reads and writes data. Thus SQL can define user privileges on objects of databases that are neither tables or views, or columns, or data, but design elements of the database ... That is why talk about rights is ... absurd and reserved the concepts of "systems" based on files and resources. In SQL we speak of privileges that were granted or dismiss.
3. NULL is not a value! I can not find my zero values. I do SELECT ... WHERE MaColonne = NULL and it does not work! NULL: this element of SQL is not a value, since it is precisely the absence of value. In fact NULL is a marker as NIL programming which indicates that after pointer there is no resource. For example compare a column to NULL makes no sense. To address markers NULL should be used as special predicates IS NULL, IS NOT NULL or functions as COALESCE or NULLIF or using the CASE structure.
To read about the processors NULL values: Operators treatment markers NULL
4. CASE SENSITIVE A false good idea is to establish a database (or some tables, columns) with a snack that makes the strings insensitive (over differences between upper and lower case). It is a heresy that use such a default behavior. Indeed so it is easy in the complaints out of breaks in the comparisons, so it is very difficult if not impossible to make elements sensitive when they were returned CASE INSENSITIVE! To make your comparisons insensitive, you just have to play with the function or UPPER LOWER present in all RDBMS.
Example: SELECT * FROM CLIENT WHERE UPPER(CLI_NOM) LIKE 'DU%T' But how to distinguish between "Smith" and "DUPONT" if your RDBMS was made insensitive? The solution is, if your RDBMS supports, columns transtyper chains binary: Example: SELECT * FROM CLIENT WHERE CAST(CLI_NOM AS VARBINARY(32)) = CAST('DUPONT' AS VARBINARY(32)) This code is obviously immeasurably longer to execute that if your RDBMS had been made sensitive! In fact, do you know why we talk capitalization to distinguish between upper and lower case? This terminology has existed since ancient times. Some time or printers had at their disposal to make matrices printing a large "bac" divided into many cases in which there were capital letters in boxes rows upper and lower boxes in rows below. As they used more frequently tiny, they were ordered down so what are more accessible to the user. This large tray, is called a break, and this storage technique is ergonomics. ! So, bottom to top the tiny capital letters, where expressions lower case (tiny) and upper case (capital letters). In short, information is not invented in this area ...
On this subject, see the problems of pages of characters and snacks: A Question of Character ...
5. The last ... An issue that often in discussions is to find the last line inserted into a table. This question makes no sense in the world of databases, because the latter have no established order. A table is a bag of ball. If you receive my bag ball, have you any means of knowing what was the last ball that I filled? The RDBMS are used to minimize the costs of managing the input / output disc. In other words, when I delete a line, that is not cleared or compacted. It remains a dead space. If a new entry arrives, it is very likely that your RDBMS will put the new line to insert!
Example: CREATE TABLE TEST (MA_COLONNE VARCHAR(32))
INSERT INTO TEST VALUES ('abc') INSERT INTO TEST VALUES ('def') INSERT INTO TEST VALUES ('ghi')
DELETE TEST WHERE MA_COLONNE = 'def'
INSERT INTO TEST VALUES ('jkl')
SELECT * FROM TEST | MA_COLONNE ---------- abc jkl ghi | You probably expect that the line "jkl" is the last! Error, RDBMS recovered instead of the line "def" to put jkl! There is no notion of order in the RDBMS, or order lines, or order of columns, except that the user wants to see included in the data by adding a special column for sorting. At most, if a column has given whose value is always higher than those already included (for example a column auto incremented) so we can find the maximum value (thus the last line inserted) by a request for calculating aggregate as SELECT MAX (... But if we need to be assured of the sequence in time, should be added to table a column containing a definition type TIMESTAMP (combined date time) with the default server as a function and CURRENT_TIMESTAMP that in the definition of the table.
Example: CREATE MaTable (ID INT, DT TIMESTAMP DEFAULT CURRENT_TIMESTAMP) But this practice is dangerous because during insertion mass accuracy of this date combined time can generate data indentiques ... That is why some RDBMS have added to their definition, a particular type as the time stamp (for example TIMESTAMP for MS SQL Server) ensuring always have a column in which all values are distinct and growing according to the time of the insertion. Finally, in the case of calculations incrementées car keys, the use of SELLECT MAX (...) +1 is a aberation which can only lead the basis for violations of constraints on primary keys as a result of competition. A read on the subject: Keys self-incrementees
6. Position ... Other problems, add a column to position, for example between the column x and y column of the table ... Other utopia of course because there is no order in the databases (see above). By order on the tables, or order on the columns within the table. Although ... ! All things considered standard SQL present in the schema information a column entitled "ORDINAL_POSITION" which specifies the place of the column within the list of columns of the table and this place is generally defined as at the creation. Any new column is added has the highest position incremented by one. This order ordinal is used mainly when inserting data (INSERT) when one does not list columns. Another feature: the ability to define the columns in the clause tri (ORDER BY) by their positional order in the SELECT clause and not by name ... The question beast which is constantly is: how to insert a column Z column exactly between X and Y column? Simply by recreating the table. But this method, which offers no certainty is cumbersome to use because it is necessary to preserve the data in a temporary table before destroying the table to recreate it in its new architecture. Besides SQL offers no certainty as to order restitution columns where, instead explicitly specify the columns I use the wildcard *! But if you obstinez want to put your new column to position then visit: Change the name or type of a column
7. Format Date ... The concept of date format does not exist in SQL or in the RDBMS. The date is stored as a type DATE, which most of the time is a whole and for TIMESTAMP (DATETIME) a reality. The whole part is the number of days elapsed since a certain date. As in SQL Server date "origin" is 1 January 1900: SELECT CAST(0 AS DATETIME) AS DATE_TEST | DATE_TEST ----------------------- 1900-01-01 00:00:00.000 | The fractional part in the elapsed time of 24 hours. For example 0.5 means the 1st January 1900 at noon for SQL Server. But how to specify a date with a literal expression as 21/01/2003 16h18m? It should write the date in the literal and transtyper a DATE or DATETIME. The trsantypage may be implicit or explicit: Examples: SELECT CAST('21/01/2003 12:23' AS DATETIME) AS DATE_TEST => conversion explicit INSERT INTO (ID, DATE_FIN) VALUES (33, '21/01/2003 12:23') =>conversion implied because the column is DATE_FIN type DATETIME But in what format conversion will she make? Most RDBMS based on the ISO standard format date, which specifies: YYYY-MM-DD hh: mm: ss.xxx Example: SELECT CAST ('2003-01-21 12:33:00.000 'AS DATETIME) AS DATE_TEST But some suggest RDBMS can freely specify the format with which one wants to work. For example, SQL Server we must position the flag DATEFORMAT with the required setting. This flag may take the values: YMD, YDM, MDY, MYD, DYM, DMY. Here's how under SQL Server on specific dates the use of ISO: SET DATEFORMAT YMD SELECT CAST('2003-01-21 12:33:00.000' AS DATETIME) AS DATE_TEST And voila! For other issues of date, please refer to the article on the management of schedules: Timing, timing and schedules to SQL ...
8. Duplicate 8.1. DISTINCTROW or a perfect example of the absurdity of Access! The quidams passing Access to a RDBMS that respects the standard SQL (SQL as a language is strongly Standard: 1986, 1992, 1999, 2003!) Does not understand that the operator does not exist in DISTINCTROW SQL ... Rapellons how is this crazy keyword: "DISTINCTROW" refers queries that use at least one knuckle. Sometimes the resulting tuples provide data doublonnées in the subset of columns of the table mother. To prevent this Access invites us to use the word DISTINCTROW. An example is always easier to understand are the elements of our game test: Either a table containing T_PERSONNE_PRS "parents": PRS_ID PRS_NOM PRS_DATE_NAISSANCE --------- -------------- -------------------- 1 DUPONT 21/05/1960 2 DUPONT 12/07/1972 3 DUVAL 16/11/1980 and a table of children T_ENFANT_ENF: ENF_ID PRS_ID ENF_PRENOM --------- -------------- -------------------- 48 1 Camille 57 1 Alain 63 2 Marcel 78 3 Raoul Here's what gives the motion which "makes flat" information: SELECT PRS.PRS_ID, PRS.PRS_NOM, PRS.PRS_DATE_NAISSANCE, ENF.ENF_ID, ENF.ENF_PRENOM FROM T_PERSONNE_PRS PRS INNER JOIN T_ENFANT_ENF ENF ON PRS.PRS_ID = ENF.PRS_ID | PRS_ID PRS_NOM PRS_DATE_NAISSANCE ENF_ID ENF_PRENOM --------- -------------- ------------------ ---------- ---------- 1 DUPONT 21/05/1960 48 Camille 1 DUPONT 21/05/1960 57 Alain 2 DUPONT 12/07/1972 63 Marcel 3 DUVAL 16/11/1980 78 Raoul | Now a query that gives an overview partial data attached: SELECT PRS.PRS_NOM FROM T_PERSONNE_PRS PRS INNER JOIN T_ENFANT_ENF ENF ON PRS.PRS_ID = ENF.PRS_ID | PRS_NOM -------------- DUPONT DUPONT DUPONT DUVAL | As you can see, the name appears three times DUPONT, which has no interest because it is impossible to know which line containing the name DUPONT is the key person of 1 or 2 or the child of key 48 , 57 or 63 ... With the key word DISTINCT (the standard SQL), we get: SELECT DISTINCT PRS.PRS_NOM FROM T_PERSONNE_PRS PRS INNER JOIN T_ENFANT_ENF ENF ON PRS.PRS_ID = ENF.PRS_ID | PRS_NOM -------------- DUPONT DUVAL | The redundant lines have been eliminated, and so much the better! But with the word DISTINCTROW Access, a surprise awaits us ... SELECT DISTINCTROW PRS.PRS_NOM FROM T_PERSONNE_PRS PRS INNER JOIN T_ENFANT_ENF ENF ON PRS.PRS_ID = ENF.PRS_ID | PRS_NOM -------------- DUPONT DUPONT DUVAL | But how this nonsense is it possible? I ask you! Simply because there are two DUPONT, one ID 1 and 2 other ID and that is why you end up with twice DUPONT, without knowing the result of reading how DUPONT is the ID 1 or 2! What's the point? A nothing, if not to confuse the developer because the operator violates the very foundations of logic ensembliste upon which the foundations of databases.
8.2. Duplicate part However, an issue that often is able dedoublonner partly a set of data. This request cache often a misunderstanding of the overall logic ensembliste databases and a lack of SQL in particular. To better understand this request, we will create our data set: Either the T_MACHINE_MCH table and the table T_INCIDENT_ICD which historise incidents on different machines: CREATE TABLE T_MACHINE_MCH (MCH_ID INTEGER NOT NULL PRIMARY KEY, MCH_NOM VARCHAR(16)) | INSERT INTO T_MACHINE_MCH VALUES (1,'Airplane') INSERT INTO T_MACHINE_MCH VALUES (2,'Bicycle') | CREATE TABLE T_INCIDENT_ICD (ICD_ID INTEGER NOT NULL PRIMARY KEY, MCH_ID INTEGER NOT NULL FOREIGN KEY REFERENCES T_MACHINE_MCH (MCH_ID), ICD_NATURE VARCHAR(16), ICD_DATEHEURE TIMESTAMP) | INSERT INTO T_INCIDENT_ICD VALUES (63, 1, ruling reactor ','2002-11-17 21:58:23') INSERT INTO T_INCIDENT_ICD VALUES (78, 1, 'Panne APU','2002-11-17 22:03:21 ') INSERT INTO T_INCIDENT_ICD VALUES (79, 2, 'Crevaison','2003-02-27 09:11:57 ') INSERT INTO T_INCIDENT_ICD VALUES (82, 2, 'Casse string','2003-03-04 11:22:33 ') INSERT INTO T_INCIDENT_ICD VALUES (87, 2, 'Wheel veiled','2003-09-01 17:14:33 ') | Our quidam would recover the last incident that took place (in chronological order) on every machine, but more importantly, it would retrieve the ID of this latest incident, in order to be able to fully identify the data concerning these incidents. Here is the junction of data between two tables: SELECT MCH.MCH_ID, MCH_NOM, ICD_ID, ICD_NATURE, ICD_DATEHEURE FROM T_MACHINE_MCH MCH INNER JOIN T_INCIDENT_ICD ICD ON MCH.MCH_ID = ICD.MCH_ID | MCH_ID MCH_NOM ICD_ID ICD_NATURE ICD_DATEHEURE ----------- ---------------- ----------- ---------------- ----------------------- 1 Avion 63 Judgement reactor 2002-11-17 21:58:23.000 1 Avion 78 Panne APU 2002-11-17 22:03:21.000 2 Cycling 79 Crevaison 2003-02-27 09:11:57.000 2 Cycling 82 Casse string 2003-03-04 11:22:33.000 2 Cycling 87 Wheel veiled 2003-01-09 17:14:33.000 | What our quidam wants to get is the following answer: MCH_ID MCH_NOM ICD_ID ICD_NATURE ICD_DATEHEURE ----------- ---------------- ----------- ---------------- ----------------------- 1 Avion 78 Panne APU 2002-11-17 22:03:21.000 2 Cycling 82 Casse string 2003-03-04 11:22:33.000 In other words it would dedoublonner on columns MCH_ID and MCH_NOM in the obtennant MAX ICD_DATEHEURE and data ICD_ID / ICD_NATURE ... Something like: SELECT DISTINCT(MCH.MCH_ID, MCH_NOM), ICD_ID, ICD_NATURE, MAX(ICD_DATEHEURE) FROM T_MACHINE_MCH MCH INNER JOIN T_INCIDENT_ICD ICD ON MCH.MCH_ID = ICD.MCH_ID | MCH_ID MCH_NOM ICD_ID ICD_NATURE ICD_DATEHEURE ----------- ---------------- ----------- ---------------- ----------------------- 1 Avion 78 Panne APU 2002-11-17 22:03:21.000 2 Cycling 82 Casse string 2003-03-04 11:22:33.000 | But DISTINCT is not a function applicable to a column or a group of column. SEPARATE affects all the data from the resulting line. We must therefore proceed in stages:
Step No. 1: find the incidents with the MAX dateHeure for each machine: SELECT MAX(ICD_DATEHEURE) AS MAX_DH, MCH_ID FROM T_INCIDENT_ICD GROUP BY MCH_ID | MAX_DH MCH_ID ------------------------------------------------------ ------ 2002-11-17 22:03:21.000 1 2003-03-04 11:22:33.000 2 | Step No. 2: link this result with the incident referred to using a complaint under correlated on the ID machine: SELECT ICD_ID, MCH_ID, ICD_NATURE, ICD_DATEHEURE FROM T_INCIDENT_ICD ICD1 WHERE ICD_DATEHEURE = (SELECT MAX(ICD_DATEHEURE) FROM T_INCIDENT_ICD ICD2 WHERE ICD2.MCH_ID = ICD1.MCH_ID) | ICD_ID MCH_ID ICD_NATURE ICD_DATEHEURE ----------- ----------- ---------------- ----------------------- 82 2 Casse string 2003-03-04 11:22:33.000 78 1 Panne APU 2002-11-17 22:03:21.000 | Step No. 3: to extend this result-jointed to get information on the machines: SELECT ICD_ID, MCH.MCH_ID, ICD_NATURE, ICD_DATEHEURE, MCH_NOM FROM T_INCIDENT_ICD ICD1 INNER JOIN T_MACHINE_MCH MCH ON ICD1.MCH_ID = MCH.MCH_ID WHERE ICD_DATEHEURE = (SELECT MAX(ICD_DATEHEURE) FROM T_INCIDENT_ICD ICD2 WHERE ICD2.MCH_ID = ICD1.MCH_ID) | ICD_ID MCH_ID ICD_NATURE ICD_DATEHEURE MCH_NOM ----------- ----------- ---------------- ---------------------------- ------- 82 2 Casse string 2003-03-04 11:22:33.000 Cycling 78 1 Panne APU 2002-11-17 22:03:21.000 Avion | And that's our problem solved. Hard to do that without complaints under ... 9. Cosmetics ... Let me present my data in such a way. How to do that with SQL? An RDBMS is certainly not done to make "cosmetic"! The cosmetics (common name) means "which is unique to beauty" ... No, an RDBMS is not, and must never be used to treat the presentation of data. Once the data are there in the result of a request, then just pick where you want to build this or that bill "presentation online column, a table, cube, a tab , In relief, in Hebrew and in color, if you are pleased, but not with a RDBMS. All these treatments must be performed by the client server layer, ie, the executable program or an object of an application server for example. BUT NOT IN CODE SQL! The SQL is not made for that, and the RDBMS will prove ALWAYS a calf if you ask it to do this job and everyone is unhappy: you first of all, as has the long processing time will désatsreux, it then (RDBMS) because it s'emmerdera make a painting while in your orchestra it is the piano, and you ask him to play the triangle! The most classic is the following ... I would like my data in the columns to be called into line and all this in a nice request! Do not tell me ... So it must demonstrate the absurdity of such a thing, demonstrating there ... As an example we will focus the calculation of turnover per quarter depending on customers. Here is a list of commands: CREATE TABLE COMMANDE (CMD_ID INTEGER, CLI_ID INTEGER, CMD_DATE DATE, CMD_MONTANT FLOAT) | INSERT INTO COMMANDE (CMD_ID, CLI_ID, CMD_DATE, CMD_MONTANT) VALUES (33, 1, '2000-10-18', 1287.22) INSERT INTO COMMANDE (CMD_ID, CLI_ID, CMD_DATE, CMD_MONTANT) VALUES (101, 1, '2001-01-15', 7854.12) INSERT INTO COMMANDE (CMD_ID, CLI_ID, CMD_DATE, CMD_MONTANT) VALUES (102, 1, '2001-03-10', 11474.25) INSERT INTO COMMANDE (CMD_ID, CLI_ID, CMD_DATE, CMD_MONTANT) VALUES (103, 1, '2001-11-01', 3587.00) INSERT INTO COMMANDE (CMD_ID, CLI_ID, CMD_DATE, CMD_MONTANT) VALUES (104, 2, '2001-02-02', 114472.89) INSERT INTO COMMANDE (CMD_ID, CLI_ID, CMD_DATE, CMD_MONTANT) VALUES (105, 2, '2001-02-25', 858.21) INSERT INTO COMMANDE (CMD_ID, CLI_ID, CMD_DATE, CMD_MONTANT) VALUES (106, 2, '2001-01-15', 7854.12) INSERT INTO COMMANDE (CMD_ID, CLI_ID, CMD_DATE, CMD_MONTANT) VALUES (107, 2, '2001-05-11', 82462.05) INSERT INTO COMMANDE (CMD_ID, CLI_ID, CMD_DATE, CMD_MONTANT) VALUES (108, 2, '2001-07-01', 61458.00) INSERT INTO COMMANDE (CMD_ID, CLI_ID, CMD_DATE, CMD_MONTANT) VALUES (109, 2, '2001-12-01', 962.28) | CLI_ID being the key foreign referencing the customer. Our leadership seeks to obtain commercial sales figures and per customer per quarter for the year 2001 ... The application can be written: SELECT CLI_ID, CASE WHEN (EXTRACT(MONTH FROM CMD_DATE) = 1) OR (EXTRACT(MONTH FROM CMD_DATE) = 2) OR (EXTRACT(MONTH FROM CMD_DATE) = 3) THEN 1 WHEN (EXTRACT(MONTH FROM CMD_DATE) = 4) OR (EXTRACT(MONTH FROM CMD_DATE) = 5) OR (EXTRACT(MONTH FROM CMD_DATE) = 6) THEN 2 WHEN (EXTRACT(MONTH FROM CMD_DATE) = 7) OR (EXTRACT(MONTH FROM CMD_DATE) = 8) OR (EXTRACT(MONTH FROM CMD_DATE) = 9) THEN 3 WHEN (EXTRACT(MONTH FROM CMD_DATE) = 10) OR (EXTRACT(MONTH FROM CMD_DATE) = 11) OR (EXTRACT(MONTH FROM CMD_DATE) = 12) THEN 4 END AS TRIMESTRE, SUM(CMD_MONTANT) AS CA FROM COMMANDE WHERE EXTRACT(YEAR FROM CMD_DATE) = 2001 GROUP BY CLI_ID, CASE WHEN (EXTRACT(MONTH FROM CMD_DATE) = 1) OR (EXTRACT(MONTH FROM CMD_DATE) = 2) OR (EXTRACT(MONTH FROM CMD_DATE) = 3) THEN 1 WHEN (EXTRACT(MONTH FROM CMD_DATE) = 4) OR (EXTRACT(MONTH FROM CMD_DATE) = 5) OR (EXTRACT(MONTH FROM CMD_DATE) = 6) THEN 2 WHEN (EXTRACT(MONTH FROM CMD_DATE) = 7) OR (EXTRACT(MONTH FROM CMD_DATE) = 8) OR (EXTRACT(MONTH FROM CMD_DATE) = 9) THEN 3 WHEN (EXTRACT(MONTH FROM CMD_DATE) = 10) OR (EXTRACT(MONTH FROM CMD_DATE) = 11) OR (EXTRACT(MONTH FROM CMD_DATE) = 12) THEN 4 END ORDER BY CLI_ID, TRIMESTRE | CLI_ID TRIMESTRE CA
1 1 19328.37 1 4 3587.0 2 1 123185.22 2 2 82462.05 2 3 61458.0 2 4 962.28 | Obviously, our department would prefer to have the results in this form: CLI_ID CA_TRIMESTRE_1 CA_TRIMESTRE_2 CA_TRIMESTRE_3 CA_TRIMESTRE_4 ------ -------------- -------------- -------------- -------------- 1 19328.37 3587.0 2 123185.22 82462.05 61458.0 962.28 This helps to realize that some quarters are at zero for some customers, which was not evident in the presentation from the response to our first complaint. Determine the turnover per customer for a quarter determined is not too hard to SQL. For example calculate it for the first quarter 2001: SELECT CLI_ID, SUM(CMD_MONTANT) AS CA_T1 FROM COMMANDE WHERE CMD_DATE BETWEEN '2001-01-01' AND '2001-03-31' GROUP BY CLI_ID | CLI_ID CA_T1 ----------- -------------- 1 19328.37 2 123185.22 | It is not possible because of groupage necessary to aggregation caculer other quarters in the same complaint. But it is possible to modify the format to bring it into line with the wishes of our department: SELECT CLI_ID, SUM(CMD_MONTANT) AS CA_T1, 0.0 AS CA_T2, 0.0 AS CA_T3, 0.0 AS CA_T2 FROM COMMANDE WHERE CMD_DATE BETWEEN '2001-01-01' AND '2001-03-31' GROUP BY CLI_ID Nothing prevents us now to repeat this request for each quarter and make a union to arrive at a table closest to the request: SELECT CLI_ID, SUM(CMD_MONTANT) AS CA_T1, 0.0 AS CA_T2, 0.0 AS CA_T3, 0.0 AS CA_T4 FROM COMMANDE WHERE CMD_DATE BETWEEN '2001-01-01' AND '2001-03-31' GROUP BY CLI_ID UNION SELECT CLI_ID, 0.0 AS CA_T1, SUM(CMD_MONTANT) AS CA_T2, 0.0 AS CA_T3, 0.0 AS CA_T4 FROM COMMANDE WHERE CMD_DATE BETWEEN '2001-04-01' AND '2001-06-30' GROUP BY CLI_ID UNION SELECT CLI_ID, 0.0 AS CA_T1, 0.0 AS CA_T2, SUM(CMD_MONTANT) AS CA_T3, 0.0 AS CA_T4 FROM COMMANDE WHERE CMD_DATE BETWEEN '2001-07-01' AND '2001-09-30' GROUP BY CLI_ID UNION SELECT CLI_ID, 0.0 AS CA_T1, 0.0 AS CA_T2, 0.0 AS CA_T3, SUM(CMD_MONTANT) AS CA_T4 FROM COMMANDE WHERE CMD_DATE BETWEEN '2001-10-01' AND '2001-12-31' GROUP BY CLI_ID | CLI_ID CA_T1 CA_T2 CA_T3 CA_T4 ----------- ------------ ----------- ------------- -------------- 1 0.0 0.0 0.0 3587.0 1 19328.38 0.0 0.0 0.0 2 0.0 0.0 0.0 962.28 2 0.0 0.0 61458.0 0.0 2 0.0 82462.05 0.0 0.0 2 123185.22 0.0 0.0 0.0 | Therefore the solution is obvious: just make the sum of these figures quarterly business customer. It is possible to imbriquant the previous request in a FROM clause of a SELECT: SELECT S.CLI_ID, SUM(S.CA_T1) AS CA_T1, SUM(S.CA_T2) AS CA_T2, SUM(S.CA_T3) AS CA_T3, SUM(S.CA_T4) AS CA_T4 FROM (SELECT CLI_ID, SUM(CMD_MONTANT) AS CA_T1, 0.0 AS CA_T2, 0.0 AS CA_T3, 0.0 AS CA_T4 FROM COMMANDE WHERE CMD_DATE BETWEEN '2001-01-01' AND '2001-03-31' GROUP BY CLI_ID UNION SELECT CLI_ID, 0.0 AS CA_T1, SUM(CMD_MONTANT) AS CA_T2, 0.0 AS CA_T3, 0.0 AS CA_T4 FROM COMMANDE WHERE CMD_DATE BETWEEN '2001-04-01' AND '2001-06-30' GROUP BY CLI_ID UNION SELECT CLI_ID, 0.0 AS CA_T1, 0.0 AS CA_T2, SUM(CMD_MONTANT) AS CA_T3, 0.0 AS CA_T4 FROM COMMANDE WHERE CMD_DATE BETWEEN '2001-07-01' AND '2001-09-30' GROUP BY CLI_ID UNION SELECT CLI_ID, 0.0 AS CA_T1, 0.0 AS CA_T2, 0.0 AS CA_T3, SUM(CMD_MONTANT) AS CA_T4 FROM COMMANDE WHERE CMD_DATE BETWEEN '2001-10-01' AND '2001-12-31' GROUP BY CLI_ID) AS S GROUP BY S.CLI_ID ORDER BY S.CLI_ID | CLI_ID CA_T1 CA_T2 CA_T3 CA_T4 ----------- ------------ ------------ ------------ ------------- 1 19328.37 0.0 0.0 3587.0 2 123185.22 82462.05 61458.0 962.28 | That is the expected result, and as NULL are ignored by the calculations of aggregation, we have even more zeros pretty explicit that the lack of information! As can be seen, the first complaint gives the same results as last ... Only the presentation changed .... But the latter has in fact 4 more requests under the request gathering under SELECT queries or 5. What is your opinion the most expensive of the two? If you use a procedural program to make this presentation from the first query, you divide the processing time by a huge factor. I bet between 100 and 1000 times faster ... So, you be the judge! Bemol However there is a flat in such a way. Indeed, some RDBMS (as some development tools, Delphi for example) offer a range of technical (or component using such techniques) say ROLAP (Relational On Line Analytical Process). In this case, such complaints are easy to write using a special aggregate (GROUPING) and key words and CUBE ROLLUP. But there is no longer presentation but multidimensional analysis! And your database must have been designed to (star, snowflake, cloud ...). 10. Insertion multiple I can not insert in several tables at once! How? The kinds of data manipulation (INSERT, UPDATE, DELETE) are not made to work over a table. Why? Because in the vast majority of cases of relations between entity, the cardinality not strictly 1:1, RDBMS can not find her grandchildren! By the way, even if this relationship is 1:1 ... impossible, except to summarize in a single table. Indeed, consider a table of men and women and compulsory marriage relationship with cardinality 1:1. This means that every man is married to a woman and one and that, likewise, a woman is married to a man and a single. Exit therefore singles and lively marriage from birth ... Therefore how can we fulfill our table? If we insert a man, coercion referential integrity requires us to specify the wife of this man. On the other hand if we try to insert a woman, then we must clarify her husband, which would imply that her husband is already inserted in the table of men, unconnected with this woman that is impossible because of the referential integrity! So, it is impossible to take any given in this model. So we must break and, for example, set a 0:1 relationship. Since then we can insert data into a table or the other without it being hampered by coercion referential integrity. But this integration can be achieved in a single table at a time! The case is even more striking in relations 1: n or n: m Indeed how to insert a table in one line and in several other lines in a single order SQL? It makes no sense! So how to insert in several tables at once to be sure that all insertions are made? The solution is provided by SQL: this is called a transaction! That is why the existence of transactions: play several orders SQL, validate all of a sudden or cancel all of a sudden.
|
|
|