Business rules

Managed element
Last edit: 2 April 2019 09:55:39
Support contact: Simon Schlosser (a member of the CDL Team who is responsible for a specific element of the CDL infrastructure)

In the Corporate Data League, business rules ensure data quality and specify how to process collaborative reviews. Currently, 1202 data quality rules are checked to ensure a high level of data quality, and 17 review rules ensure efficient processing of collaborative reviews. Each business rule is documented in a form that business users can understand, and the rule definitions reference the defined data model concepts.

Standardizing data quality across all CDL Members is a collaborative endeavor managed by the CDL Team. Rules need to be analyzed and adapted continuously: existing rules may become inapplicable, or new rules may need to be introduced. Some business rules specify which data values are valid for specific attributes. For this purpose, reference data is collected and documented on these pages (e.g. countries). Integrating new reference data usually entails adapting an existing business rule or creating new ones.

Business rule use cases

There are different use cases that are supported by the CDL data validation. The list will be continuously extended:

Data quality rules

The list below gives an overview of all data quality rules that are currently documented and implemented.

Business rule | Definition
Business partner name missing | Each business partner must have at least one name. With respect to the CDL data model, at least one name of type LOCAL or INTERNATIONAL must be present.
Care of information misplaced | Care of information (typically indicated by "c/o") must not be maintained in the business partner's name, locality or thoroughfare but is to be managed in the care of attribute. If care of information is found in an attribute other than "care of", the rule is violated.
Contact information misplaced | Contact information is not allowed in the registered name, trade name or international name. This rule checks whether contact information is misplaced, e.g. by identifying common keywords such as "attn:" or "z.Hd." and additionally by parsing the company name for natural person names that are not meant to be part of the legal name (e.g. when natural person names are placed after the legal form).
EIN format invalid (United States) | This rule checks the format of the Employer identification number (United States) as described in the additional information tab.
Identifier format invalid (European value added tax identifier (Austria)) | The European value added tax identifier in Austria consists of the prefix "AT" followed by the character "U" and 8 numerical digits. This rule checks for the presence of "U" followed by exactly 8 digits, ignoring any whitespace, hyphens or dots contained in the identifier value. If there are fewer or more digits, if digits are misplaced, or if the "U" is missing, the rule is violated.
Identifier format invalid (European value added tax identifier (Belgium)) | The European value added tax identifier in Belgium consists of exactly 10 numerical digits prefixed by "BE". The first digit following the prefix is always 0 or 1. This rule checks for exactly 10 digits, ignoring any whitespace, hyphens or dots contained in the identifier value. If there are fewer or more digits, if non-numeric characters occur, or if the first digit is neither 0 nor 1, the rule is violated.
Identifier format invalid (European value added tax identifier (Bulgaria)) | The European value added tax identifier in Bulgaria consists of 9 or 10 numerical digits prefixed by "BG". This rule checks for 9 or 10 digits, ignoring any whitespace, hyphens or dots contained in the identifier value. If there are fewer or more digits, or if non-numeric characters occur, the rule is violated.
Identifier format invalid (European value added tax identifier (Cyprus)) | The European value added tax identifier in Cyprus consists of 9 characters (8 numerical digits + 1 letter) prefixed by "CY". This rule checks for 8 numerical digits followed by a letter, ignoring any whitespace, hyphens or dots contained in the identifier value. If there are fewer or more digits, if digits are misplaced, or if the last character is a numerical digit rather than a letter, the rule is violated.
Identifier format invalid (European value added tax identifier (Czech Republic)) | The European value added tax identifier in the Czech Republic consists of 8 to 10 numerical digits prefixed by "CZ". This rule checks for 8 to 10 numerical digits, ignoring any whitespace, hyphens or dots contained in the identifier value. If there are fewer or more digits, or if digits are misplaced, the rule is violated.
Identifier format invalid (European value added tax identifier (Denmark)) | The European value added tax identifier in Denmark consists of exactly 8 numerical digits prefixed by "DK". This rule checks for exactly 8 digits, ignoring any whitespace, hyphens or dots contained in the identifier value. If there are fewer or more digits, or if non-numeric characters occur, the rule is violated.
... further results
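
The format checks for some of the value added tax identifiers listed above can be sketched with regular expressions. The following Python sketch is not the CDL implementation; the names VAT_PATTERNS and vat_format_valid are hypothetical, and it assumes that the country prefix is part of the identifier value and that letters are upper case.

```python
import re

# Hypothetical patterns transcribed from the rule definitions above.
# Assumption: the country prefix is part of the value, letters are upper case.
VAT_PATTERNS = {
    "AT": re.compile(r"ATU[0-9]{8}"),       # "U" followed by exactly 8 digits
    "BE": re.compile(r"BE[01][0-9]{9}"),    # 10 digits, first digit 0 or 1
    "BG": re.compile(r"BG[0-9]{9,10}"),     # 9 or 10 digits
    "CY": re.compile(r"CY[0-9]{8}[A-Z]"),   # 8 digits followed by a letter
    "DK": re.compile(r"DK[0-9]{8}"),        # exactly 8 digits
}


def vat_format_valid(country: str, value: str) -> bool:
    """Return True if the value matches the format for the given country."""
    # Whitespace, hyphens and dots are ignored, as the definitions state.
    normalized = re.sub(r"[\s.\-]", "", value)
    return VAT_PATTERNS[country].fullmatch(normalized) is not None


print(vat_format_valid("AT", "ATU 1234-5678"))  # separators are ignored
print(vat_format_valid("BE", "BE2123456789"))   # first digit is not 0 or 1
```

A rule is violated whenever vat_format_valid returns False. Using fullmatch rather than search ensures that surplus digits before or after the expected pattern also count as a violation.
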
Business rule | Definition
Cadastro de Pessoa Fisica invalid (Brazil) | The CPF number is an identification number of Brazilian citizens issued by the Brazilian Ministry of Revenue, which is called "Ministério da Fazenda". CPF stands for "Cadastro de Pessoa Física" (literally, physical person registration), as opposed to the CNPJ number for companies. The CPF consists of 11 digits C1...C11, where C1...C9 are random numbers and C10 and C11 are the check digits. A check digit is calculated as follows:
  1. From right to left, all digits are multiplied by a descending sequence starting with 9.
  2. The sum of all products is computed.
  3. The sum of step 2 is taken modulo 11.
  4. The result of step 3 is taken modulo 10.
  5. The check digit found is appended to the number and steps 1 to 4 are repeated to obtain the second check digit.
Company is greylisted | The rule checks whether a given business partner is known to be inactive, i.e. "out of business", "in liquidation" or in a similar status. For this purpose, the rule searches for information in the CDL business partner repository and, in addition, in several connected data sources. These are:
  • UK: Companies House
  • CH: Swiss business register
  • FR: French business register (SIREN)
  • ... further will follow
Fundamental address parts missing | An address, whether a PO box or street address, must comprise at least a post code or a locality.
Identifier format invalid (Tax identification number (Italy)) | The Tax identification number (Italy) is known as Codice Fiscale and consists of 16 characters, where C1 to C6 are alphabetic, C7 and C8 are numeric, C9 is alphabetic, C10 and C11 are numeric, C12 is alphabetic, C13 to C15 are numeric and C16 is an alphabetic check character.
  • C1 C2 C3: letters for the last name.
  • C4 C5 C6: letters for the first name.
  • C7 C8: numbers for the year of birth.
  • C9: a letter for the month of birth.
  • C10 C11: numbers for the day of birth and sex.
  • C12 C13 C14 C15: one letter and three numbers for the Italian town or the foreign state of birth.
  • C16: has a supervisory function. It is a checksum character.
Identifier format inaccurate (AFM number (Greece)) | This rule checks the syntax, i.e. the format, of the AFM number (Greece) with respect to the reference format. Any deviation (e.g. whitespace where none is specified) results in a violation. The AFM number (Greece) for legal entities consists of 9 digits. This rule checks for whitespace, hyphens or dots contained in the identifier value. If there is any whitespace, hyphen or dot, the rule is violated.
Identifier format inaccurate (Business Number (Australia)) | This rule checks the syntax, i.e. the format, of the Australian Business Number (ABN) with respect to the reference format. Any deviation (e.g. whitespace where none is specified) results in a violation. The Australian Business Number (ABN) consists of 11 digits in the format "99 999 999 999". This rule checks for whitespace, hyphens or dots contained in the identifier value. If there is whitespace where none is expected, or any hyphen or dot, the rule is violated.
Identifier format inaccurate (Business number (Canada)) | This rule checks the syntax, i.e. the format, of the Business number in Canada with respect to the reference format. Any deviation (e.g. whitespace where none is specified) results in a violation. The Canadian business number consists of exactly 9 numerical digits. This rule checks for whitespace, hyphens or dots contained in the identifier value. If there is any whitespace, hyphen or dot, the rule is violated.
Identifier format inaccurate (CIF number (Spain)) | This rule checks the syntax, i.e. the format, of the CIF number in Spain with respect to the reference format. Any deviation (e.g. whitespace where none is specified) results in a violation. The CIF number (Spain) consists of a letter followed by 8 digits. This rule checks for whitespace, hyphens or dots contained in the identifier value. If there is any whitespace, hyphen or dot, the rule is violated.
Identifier format inaccurate (CNPJ number (Brazil)) | This rule checks the syntax, i.e. the format, of the CNPJ number (Brazil) with respect to the reference format. Any deviation (e.g. whitespace where none is specified) results in a violation. The CNPJ consists of a 14-digit number formatted as 00.000.000/0001-00. This rule checks for whitespace, hyphens or dots contained in the identifier value. If there is any whitespace, hyphen or dot in a place where it is not expected, the rule is violated.
Identifier format inaccurate (CUIT number (Argentina)) | This rule checks the syntax, i.e. the format, of the CUIT number (Argentina) with respect to the reference format. Any deviation (e.g. whitespace where none is specified) results in a violation. The CUIT number in Argentina consists of 11 numerical digits. This rule checks for whitespace, hyphens or dots contained in the identifier value. If there is any whitespace, hyphen or dot, the rule is violated.
... further results
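
The CPF check digit procedure described in the table above can be sketched in a few lines of Python. This is an illustration of the listed steps, not the CDL implementation; the function names are hypothetical, and stripping punctuation before the check is an assumption that is not stated in the rule definition. The sample number in the usage lines is an illustrative test value and does not come from this page.

```python
from typing import List


def cpf_check_digit(digits: List[int]) -> int:
    """Compute one check digit for 9 or 10 leading CPF digits."""
    if not 9 <= len(digits) <= 10:
        raise ValueError("expected 9 or 10 digits")
    # Steps 1 and 2: weight digits from right to left with 9, 8, 7, ...
    # and add up the products.
    total = sum(w * d for w, d in zip(range(9, -1, -1), reversed(digits)))
    # Steps 3 and 4: modulo 11, then modulo 10.
    return total % 11 % 10


def cpf_valid(value: str) -> bool:
    """Check C10 and C11 of a CPF number against the computed digits."""
    # Assumption: dots, hyphens and spaces are ignored.
    digits = [int(ch) for ch in value if ch in "0123456789"]
    if len(digits) != 11:
        return False
    base = digits[:9]
    c10 = cpf_check_digit(base)
    # Step 5: append the first check digit and repeat for the second.
    c11 = cpf_check_digit(base + [c10])
    return digits[9:] == [c10, c11]


print(cpf_valid("111.444.777-35"))  # True
print(cpf_valid("111.444.777-36"))  # False
```
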

Query example

Data quality rules, listed by country and attributes (Excel):


Review rules

The list below gives an overview of all review rules that are currently documented and implemented.

Business rule | Definition | Status
Auto reject removal of given legal form | Auto reject a review if the only difference is a missing Business partner legal form in the update while current data comprises a Business partner legal form. | RELEASED
Auto reject update with error defect in address | Auto reject a review if the address violates at least one data quality rule. | RELEASED
Auto reject update with error defect in business partner | Auto reject a review if the business partner violates at least one data quality rule. | RELEASED

Business rule | Definition | Status
Auto reject additional legal address | Adding a legal address is rejected if a legal address already exists. | DRAFT
Auto reject address country update | Do not allow edits of Country. | DRAFT

No inactive auto reject review rules

Business rule | Definition | Status
Review address locality update | Trigger manual review for Locality value updates. | RELEASED
Review address post code update | Trigger manual review for Post code updates, unless the updated value is more precise than the current data. | RELEASED
Review address thoroughfare number update | Trigger manual review for Thoroughfare number updates, unless the updated value is more precise than the current data. | RELEASED
Review address thoroughfare update | Trigger manual review for Thoroughfare value updates. | RELEASED
Review business partner identifier update | Trigger manual review if an Identifier is changed. | RELEASED
Review business partner legal form update | Trigger manual review if Business partner legal form is changed. | RELEASED
Review major business partner name update | Trigger manual review if Name value is changed in a significant way. In this context, "significant" means that the names differ in more than 2 categories, e.g. upper/lower case and additional or missing punctuation (e.g. "-", ".", "," or ";"). | RELEASED

No draft manual review rules

No inactive manual review rules

No released review processing rules

Business rule | Definition | Status
Mandatory review of address removals | Removal of addresses is always reviewed. | DRAFT
Merge of business partner updates | If there is already a review pending for the provided business partner, the new update is merged into the pending one. Values from the newer update are preferred. | DRAFT
Reviews of non-legal addresses | A new address is only reviewed if it is a legal address. Adding any other address does not trigger a review and is accepted as is. | DRAFT
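
The merge behaviour of the draft rule "Merge of business partner updates" can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that an update is a flat mapping of attribute names to values; the function name merge_updates and the attribute names are hypothetical and not taken from the CDL data model.

```python
def merge_updates(pending: dict, newer: dict) -> dict:
    """Merge a newer update into a pending one; newer values win."""
    merged = dict(pending)   # keep the pending update unchanged
    merged.update(newer)     # values from the newer update are preferred
    return merged


pending = {"name": "ACME AG", "locality": "Bern"}
newer = {"locality": "Zurich", "postCode": "8000"}
print(merge_updates(pending, newer))
```

Attributes that only occur in the pending update are retained, while attributes present in both take the value of the newer update.
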

No inactive review processing rules

Standardization rules

The list below gives an overview of all business rules to standardize business partners and addresses that are currently documented and implemented.

Contribute!

We are continuously defining and implementing additional rules. Please get in touch with us if you notice that a business rule is missing! If you are interested in the business rules management architecture and its implementation, we would also be happy to provide you with additional information or a showcase.

Technical Implementation

The technical implementation of the business rules uses the semantics defined in this wiki. The knowledge documented in this wiki is stored as RDF triples in a triple store (Jena TBC) and made accessible through a SPARQL endpoint (Jena Fuseki). Business rules are translated into a semantic representation as RDF triples and added to the ontology provided by the endpoint. SPIN is used to represent and execute the rules. SPIN is a collection of RDF vocabularies that enable the use of SPARQL to define constraints and inference rules on Semantic Web models. To check a business partner or an address for business rule violations, the data record is translated into a semantic representation that is an instantiation of the data model. The business rules then check whether the instance conforms to the world defined in this wiki.

Theory

From a theoretical point of view, the data model concepts and the relations between them define the Corporate Data League domain (in other words, the world as it is understood by the CDL). Within this world, everything would be possible if there were no rules. Business rules constrain this world by reducing the space of possible instantiations of the modeled domain. An example is a business rule that constrains the possible values of a country: the only allowed values are the countries defined in the ISO 3166-1 standard. These countries are documented as reference data in this portal. Without this rule, a country could have any other value, such as "Romulus". To take up the wording from above: the documented countries are knowledge about the CDL world (domain), and this knowledge is used to constrain the possible space of options for the name of a country.
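
The country example can be made concrete with a small Python sketch. It only illustrates the idea of constraining values by reference data and does not reflect the SPIN-based implementation; REFERENCE_COUNTRIES is a hypothetical three-entry excerpt of the ISO 3166-1 reference data, and the record layout is an assumption.

```python
# Hypothetical excerpt of the ISO 3166-1 reference data (code -> name).
REFERENCE_COUNTRIES = {
    "CH": "Switzerland",
    "DE": "Germany",
    "FR": "France",
}


def violates_country_rule(record: dict) -> bool:
    """Return True if the record's country is not in the reference data."""
    # Assumption: the record stores the country name under "country".
    return record.get("country") not in REFERENCE_COUNTRIES.values()


print(violates_country_rule({"country": "Switzerland"}))  # False
print(violates_country_rule({"country": "Romulus"}))      # True
```

Without the rule, both records would be valid instances of the data model; with it, only the record whose country appears in the reference data remains valid.
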