amending, for the purpose of its adaptation to technical progress, the Annex to Regulation (EC) No 440/2008 laying down test methods pursuant to Regulation (EC) No 1907/2006 of the European Parliament and of the Council on the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH)
(Text with EEA relevance)
THE EUROPEAN COMMISSION,
Having regard to the Treaty on the Functioning of the European Union,
Having regard to Regulation (EC) No 1907/2006 of the European Parliament and of the Council of 18 December 2006 concerning the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH), establishing a European Chemicals Agency, amending Directive 1999/45/EC and repealing Council Regulation (EEC) No 793/93 and Commission Regulation (EC) No 1488/94 as well as Council Directive 76/769/EEC and Commission Directives 91/155/EEC, 93/67/EEC, 93/105/EC and 2000/21/EC (1), and in particular Article 13(2) thereof,
Whereas:
(1)
Commission Regulation (EC) No 440/2008 (2) contains the test methods for the purposes of the determination of the physicochemical properties, toxicity and ecotoxicity of chemicals to be applied for the purposes of Regulation (EC) No 1907/2006.
(2)
The Organisation for Economic Cooperation and Development (OECD) develops harmonised and internationally agreed test guidelines for the testing of chemicals for regulatory purposes. The OECD regularly issues new and revised test guidelines, taking account of scientific progress in this area.
(3)
In order to take into account technical progress and, whenever possible, to reduce the number of animals used for experimental purposes in accordance with Article 13(2) of Regulation (EC) No 1907/2006, following the adoption of relevant OECD test guidelines, two new test methods for the assessment of ecotoxicity and nine new test methods for the determination of toxicity to human health should be laid down and seven test methods should be updated. Eleven of those test methods relate to in vitro tests for skin and eye irritation/corrosion, skin sensitisation, genotoxicity and endocrine effects. Stakeholders have been consulted on the proposed amendment.
(4)
Regulation (EC) No 440/2008 should therefore be amended accordingly.
(5)
The measures provided for in this Regulation are in accordance with the opinion of the Committee established under Article 133 of Regulation (EC) No 1907/2006,
HAS ADOPTED THIS REGULATION:
Article 1
The Annex to Regulation (EC) No 440/2008 is amended in accordance with the Annex to this Regulation.
Article 2
This Regulation shall enter into force on the twentieth day following that of its publication in the Official Journal of the European Union.
This Regulation shall be binding in its entirety and directly applicable in all Member States.
(2) Commission Regulation (EC) No 440/2008 of 30 May 2008 laying down test methods pursuant to Regulation (EC) No 1907/2006 of the European Parliament and of the Council on the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) (OJ L 142, 31.5.2008, p. 1).
ANNEX
The Annex to Regulation (EC) No 440/2008 is amended as follows:
(1)
In part B, Chapter B.4 is replaced by the following:
"B.4 ACUTE DERMAL IRRITATION/CORROSION
INTRODUCTION
1.
This test method is equivalent to OECD test guideline (TG) 404 (2015). OECD guidelines for testing of Chemicals are periodically reviewed to ensure that they reflect the best available science. In the review of OECD TG 404, special attention was given to possible improvements in relation to animal welfare concerns and to the evaluation of all existing information on the test chemical in order to avoid unnecessary testing in laboratory animals. The updated version of OECD TG 404 (originally adopted in 1981, revised in 1992, 2002 and 2015) includes reference to the Guidance Document on Integrated Approaches to Testing and Assessment (IATA) for Skin Irritation/Corrosion (1), proposing a modular approach for skin irritation and skin corrosion testing. The IATA describes several modules which group information sources and analysis tools, and (i) provides guidance on how to integrate and use existing testing and non-testing data for the assessment of the skin irritation and skin corrosion potentials of chemicals and (ii) proposes an approach when further testing is needed (1). In addition, where needed, the successive, instead of simultaneous, application of the three test patches to the animal in the initial in vivo test is recommended in that Guideline.
2.
Definitions of dermal irritation and corrosion are set out in the Appendix to this test method.
INITIAL CONSIDERATIONS
3.
In the interest of both sound science and animal welfare, in vivo testing should not be undertaken until all available data relevant to the potential dermal corrosivity/irritation of the test chemical have been evaluated in a weight-of-the-evidence (WoE) analysis as presented in the Guidance Document on Integrated Approaches to Testing and Assessment for Skin Corrosion and Irritation, i.e. over the three Parts of this guidance and their corresponding modules (1). Briefly, under Part 1 existing data are addressed over seven modules covering human data, in vivo data, in vitro data, physico-chemical properties data (e.g. pH, in particular strong acidity or alkalinity) and non-testing methods. Under Part 2, a WoE analysis is performed. If this WoE analysis is still inconclusive, Part 3 should be conducted with additional testing, starting with in vitro methods, with in vivo testing used as a last resort. This analysis should therefore decrease the need for in vivo testing for dermal corrosivity/irritation of test chemicals for which sufficient evidence already exists from other studies as to those two endpoints.
PRINCIPLE OF THE IN VIVO TEST
4.
The test chemical to be tested is applied in a single dose to the skin of an experimental animal; untreated skin areas of the test animal serve as the control. The degree of irritation/corrosion is read and scored at specified intervals and is further described in order to provide a complete evaluation of the effects. The duration of the study should be sufficient to evaluate the reversibility or irreversibility of the effects observed.
5.
Animals showing continuing signs of severe distress and/or pain at any stage of the test should be humanely killed, and the test chemical assessed accordingly. Criteria for making the decision to humanely kill moribund and severely suffering animals are the subject of a separate Guidance Document (2).
PREPARATIONS FOR THE IN VIVO TEST
Selection of animal species
6.
The albino rabbit is the preferable laboratory animal, and healthy young adult rabbits are used. A rationale for using other species should be provided.
Preparation of the animals
7.
Approximately 24 hours before the test, fur should be removed by closely clipping the dorsal area of the trunk of the animals. Care should be taken to avoid abrading the skin, and only animals with healthy, intact skin should be used.
8.
Some strains of rabbit have dense patches of hair that are more prominent at certain times of the year. Such areas of dense hair growth should not be used as test sites.
Housing and feeding conditions
9.
Animals should be individually housed. The temperature of the experimental animal room should be 20 °C (± 3 °C) for rabbits. Although the relative humidity should be at least 30 % and preferably not exceed 70 %, other than during room cleaning, the aim should be 50-60 %. Lighting should be artificial, the sequence being 12 hours light, 12 hours dark. For feeding, conventional laboratory diets may be used with an unrestricted supply of drinking water.
TEST PROCEDURE
Application of the test chemical
10.
The test chemical should be applied to a small area (approximately 6 cm2) of skin and covered with a gauze patch, which is held in place with non-irritating tape. In cases in which direct application is not possible (e.g. liquids or some pastes), the test chemical should first be applied to the gauze patch, which is then applied to the skin. The patch should be loosely held in contact with the skin by means of a suitable semi-occlusive dressing for the duration of the exposure period. If the test chemical is applied to the patch, it should be attached to the skin in such a manner that there is good contact and uniform distribution of the test chemical on the skin. Access by the animal to the patch and ingestion or inhalation of the test chemical should be prevented.
11.
Liquid test chemicals are generally used undiluted. When testing solids (which may be pulverised, if considered necessary), the test chemical should be moistened with the smallest amount of water (or, where necessary, of another suitable vehicle) sufficient to ensure good skin contact. When vehicles other than water are used, the potential influence of the vehicle on irritation of the skin by the test chemical should be minimal, if any.
12.
At the end of the exposure period, which is normally 4 hours, residual test chemical should be removed, where practicable, using water or an appropriate solvent without altering the existing response or the integrity of the epidermis.
Dose level
13.
A dose of 0,5 ml of liquid or 0,5 g of solid or paste is applied to the test site.
Initial test (In vivo dermal irritation/corrosion test using one animal)
14.
When a test chemical has been judged to be corrosive, irritant or non-classified on the basis of a weight-of-evidence analysis or of previous in vitro testing, further in vivo testing is normally not necessary. However, in cases where additional data are considered warranted, the in vivo test is performed initially using one animal and applying the following approach. Up to three test patches are applied sequentially to the animal. The first patch is removed after three minutes. If no serious skin reaction is observed, a second patch is applied at a different site and removed after one hour. If the observations at this stage indicate that exposure can humanely be allowed to extend to four hours, a third patch is applied and removed after four hours, and the response is graded.
15.
If a corrosive effect is observed after any of the three sequential exposures, the test is immediately terminated. If a corrosive effect is not observed after the last patch is removed, the animal is observed for 14 days, unless corrosion develops at an earlier time point.
16.
In those cases in which the test chemical is not expected to produce corrosion but may be irritating, a single patch should be applied to one animal for four hours.
Confirmatory test (In vivo dermal irritation test with additional animals)
17.
If a corrosive effect is not observed in the initial test, the irritant or negative response should be confirmed using up to two additional animals, each with one patch, for an exposure period of four hours. If an irritant effect is observed in the initial test, the confirmatory test may be conducted in a sequential manner, or by exposing two additional animals simultaneously. In the exceptional case, in which the initial test is not conducted, two or three animals may be treated with a single patch, which is removed after four hours. When two animals are used, if both exhibit the same response, no further testing is needed. Otherwise, the third animal is also tested. Equivocal responses may need to be evaluated using additional animals.
Observation period
18.
The duration of the observation period should be sufficient to evaluate fully the reversibility of the effects observed. However, the experiment should be terminated at any time that the animal shows continuing signs of severe pain or distress. To determine the reversibility of effects, the animals should be observed up to 14 days after removal of the patches. If reversibility is seen before 14 days, the experiment should be terminated at that time.
Clinical observations and grading of skin reactions
19.
All animals should be examined for signs of erythema and oedema, and the responses scored at 60 minutes, and then at 24, 48 and 72 hours after patch removal. For the initial test in one animal, the test site is also examined immediately after the patch has been removed. Dermal reactions are graded and recorded according to the grades in the Table below. If there is damage to skin which cannot be identified as irritation or corrosion at 72 hours, observations may be needed until day 14 to determine the reversibility of the effects. In addition to the observation of irritation, all local toxic effects, such as defatting of the skin, and any systemic adverse effects (e.g. effects on clinical signs of toxicity and body weight), should be fully described and recorded. Histopathological examination should be considered to clarify equivocal responses.
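The scoring time points in paragraph 19 and the 14-day limit in paragraph 18 lend themselves to a simple schedule calculation. The following Python sketch is illustrative only and is not part of the test method; the names `scoring_schedule` and `observation_end` are hypothetical, and the example date is arbitrary.

```python
from datetime import datetime, timedelta
from typing import List

# Scoring time points after patch removal (paragraph 19), in hours.
SCORING_OFFSETS_H = [1, 24, 48, 72]
# Maximum observation period after patch removal (paragraph 18), in days.
MAX_OBSERVATION_DAYS = 14


def scoring_schedule(patch_removed_at: datetime) -> List[datetime]:
    """Return the times at which erythema and oedema are scored."""
    return [patch_removed_at + timedelta(hours=h) for h in SCORING_OFFSETS_H]


def observation_end(patch_removed_at: datetime) -> datetime:
    """Return the latest time up to which the animal is observed."""
    return patch_removed_at + timedelta(days=MAX_OBSERVATION_DAYS)


# Arbitrary example: patch removed on 1 March 2024 at 13:00.
removed = datetime(2024, 3, 1, 13, 0)
for when in scoring_schedule(removed):
    print(when.isoformat())
print("observe until:", observation_end(removed).isoformat())
```

The schedule covers only the fixed time points; the immediate examination after patch removal in the initial test, and any early termination, remain matters for the personnel performing the observations.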
20.
The grading of skin responses is necessarily subjective. To promote harmonisation in grading of skin response and to assist testing laboratories and those involved in making and interpreting the observations, the personnel performing the observations need to be adequately trained in the scoring system used (see Table below). An illustrated guide for grading skin irritation and other lesions could be helpful (3).
DATA AND REPORTING
21.
Study results should be summarised in tabular form in the final test report and should cover all items listed in paragraph 24.
Evaluation of results
22.
The dermal irritation scores should be evaluated in conjunction with the nature and severity of lesions, and their reversibility or lack of reversibility. The individual scores do not represent an absolute standard for the irritant properties of a material, as other effects of the test material are also evaluated. Instead, individual scores should be viewed as reference values, which need to be evaluated in combination with all other observations from the study.
23.
Reversibility of dermal lesions should be considered in evaluating irritant responses. When responses such as alopecia (limited area), hyperkeratosis, hyperplasia and scaling, persist to the end of the 14-day observation period, the test chemical should be considered an irritant.
Test report
24.
The test report must include the following information:
Rationale for in vivo testing:
—
Weight-of-evidence analysis of pre-existing test data, including results from sequential testing strategy;
—
Description of relevant data available from prior testing;
—
Data derived at each stage of testing strategy;
—
Description of in vitro tests performed, including details of procedures, results obtained with test/reference substances;
—
Weight-of-the-evidence analysis for performing in vivo study.
Test chemical:
—
Mono-constituent substance: chemical identification, such as IUPAC or CAS name, CAS number, SMILES or InChI code, structural formula, purity, chemical identity of impurities as appropriate and practically feasible, etc;
—
Multi-constituent substance, mixture and substances of unknown or variable composition, complex reaction products or biological materials (UVCB): characterised as far as possible by chemical identity (see above), quantitative occurrence and relevant physico-chemical properties of the constituents;
—
Physical appearance, water solubility, and additional relevant physico-chemical properties;
—
Source, lot number if available;
—
Treatment of the test chemical/control substance prior to testing, if applicable (e.g. warming, grinding);
—
Stability of the test chemical, limit date for use, or date for re-analysis if known;
Test animal:
—
Species/strain used, rationale for using animal(s) other than albino rabbit;
—
Number of animal(s) of each sex;
—
Individual animal weight(s) at start and conclusion of test;
—
Age at start of study;
—
Source of animal(s), housing conditions, diet, etc.
Test conditions:
—
Technique of patch site preparation;
—
Details of patch materials used and patching technique;
—
Details of test chemical preparation, application, and removal.
Results:
—
Tabulation of irritation/corrosion response scores for each animal at all time points measured;
—
Descriptions of all lesions observed;
—
Narrative description of nature and degree of irritation or corrosion observed, and any histopathological findings;
—
Description of other adverse local (e.g. defatting of skin) and systemic effects in addition to dermal irritation or corrosion.
Discussion of results
Conclusions
LITERATURE
(1)
OECD (2014). Guidance document on Integrated Approaches to Testing and Assessment for Skin Irritation/Corrosion. Environmental Health and Safety Publications, Series on Testing and Assessment, (No 203), Organisation for Economic Cooperation and Development, Paris.
(2)
OECD (1998) Harmonized Integrated Hazard Classification System for Human Health and Environmental Effects of Chemical Substances, as endorsed by the 28th Joint Meeting of the Chemicals Committee and the Working Party on Chemicals, November 1998.
(3)
OECD (2000). Guidance Document on the Recognition, Assessment and Use of Clinical Signs as Humane Endpoints for Experimental Animals Used in Safety Evaluation. Environmental Health and Safety Publications, Series on Testing and Assessment (No 19), Organisation for Economic Cooperation and Development, Paris.
Table
Grading of Skin Reactions
Erythema and eschar formation
No erythema … 0
Very slight erythema (barely perceptible) … 1
Well defined erythema … 2
Moderate to severe erythema … 3
Severe erythema (beef redness) to eschar formation preventing grading of erythema … 4
Maximum possible: 4
Oedema formation
No oedema … 0
Very slight oedema (barely perceptible) … 1
Slight oedema (edges of area well defined by definite raising) … 2
Moderate oedema (raised approximately 1 mm) … 3
Severe oedema (raised more than 1 mm and extending beyond area of exposure) … 4
Maximum possible: 4
Histopathological examination may be carried out to clarify equivocal responses.
Appendix
DEFINITIONS
Chemical is a substance or a mixture.
Dermal irritation is the production of reversible damage of the skin following the application of a test chemical for up to 4 hours.
Dermal corrosion is the production of irreversible damage of the skin; namely, visible necrosis through the epidermis and into the dermis, following the application of a test chemical for up to four hours. Corrosive reactions are typified by ulcers, bleeding, bloody scabs, and, by the end of observation at 14 days, by discolouration due to blanching of the skin, complete areas of alopecia, and scars. Histopathology should be considered to evaluate questionable lesions.
Test chemical is any substance or mixture tested using this test method.
"
(2)
In Part B, Chapter B.17 is replaced by the following:
"B.17 IN VITRO MAMMALIAN CELL GENE MUTATION TESTS USING THE HPRT AND XPRT GENES
INTRODUCTION
1.
This test method (TM) is equivalent to the OECD test guideline 476 (2016). Test methods are periodically reviewed in the light of scientific progress, changing regulatory needs and animal welfare. This current revised version of TM B.17 reflects nearly thirty years of experience with this test and also results from the development of a separate new method dedicated to in vitro mammalian cell gene mutation tests using the thymidine kinase gene. TM B.17 is part of a series of test methods on genetic toxicology. A document that provides succinct information on genetic toxicology testing and an overview of the recent changes that were made to genetic toxicity OECD test guidelines has been developed by OECD (1).
2.
The purpose of the in vitro mammalian cell gene mutation test is to detect gene mutations induced by chemicals. The cell lines used in these tests measure forward mutations in reporter genes, specifically the endogenous hypoxanthine-guanine phosphoribosyl transferase gene (Hprt in rodent cells, HPRT in human cells; collectively referred to as the Hprt gene and HPRT test in this test method), and the xanthine-guanine phosphoribosyl transferase transgene (gpt) (referred to as the XPRT test). The HPRT and XPRT mutation tests detect different spectra of genetic events. In addition to the mutational events detected by the HPRT test (e.g. base pair substitutions, frameshifts, small deletions and insertions), the autosomal location of the gpt transgene may allow the detection of mutations resulting from large deletions and possibly mitotic recombination not detected by the HPRT test because the Hprt gene is located on the X-chromosome (2) (3) (4) (5) (6) (7). The XPRT test is currently less widely used than the HPRT test for regulatory purposes.
3.
Definitions used are provided in Appendix 1.
INITIAL CONSIDERATIONS AND LIMITATIONS
4.
Tests conducted in vitro generally require the use of an exogenous source of metabolic activation. The exogenous metabolic activation system does not entirely mimic in vivo conditions.
5.
Care should be taken to avoid conditions that would lead to artefactual positive results (i.e. possible interaction with the test system) not caused by direct interaction between the test chemicals and the genetic material of the cell; such conditions include changes in pH or osmolality (8) (9) (10), interaction with the medium components (11) (12), or excessive levels of cytotoxicity (13). Cytotoxicity exceeding the recommended top cytotoxicity levels as defined in paragraph 20 is considered excessive for the HPRT test.
6.
Before use of the test method on a mixture for generating data for an intended regulatory purpose, it should be considered whether, and if so why, it may provide adequate results for that purpose. Such considerations are not needed when there is a regulatory requirement for testing of the mixture.
PRINCIPLE OF THE TEST
7.
Mutant cells deficient in Hprt enzyme activity in the HPRT test or xprt enzyme activity in the XPRT test are resistant to the cytostatic effects of the purine analogue 6-thioguanine (TG). The Hprt (in the HPRT test) or gpt (in the XPRT test) proficient cells are sensitive to TG, which causes the inhibition of cellular metabolism and halts further cell division. Thus, mutant cells are able to proliferate in the presence of TG, whereas normal cells, which contain the Hprt (in the HPRT test) or gpt (in the XPRT test) enzyme, are not.
8.
Cells in suspension or monolayer cultures are exposed to the test chemical, both with and without an exogenous source of metabolic activation (see paragraph 14), for a suitable period of time (3-6 hours), and then sub-cultured to determine cytotoxicity and to allow phenotypic expression prior to mutant selection (14) (15) (16) (17). Cytotoxicity is determined by relative survival (RS), i.e. cloning efficiency measured immediately after treatment and adjusted for any cell loss during treatment as compared to the negative control (paragraph 18 and Appendix 2). The treated cultures are maintained in growth medium for a sufficient period of time, characteristic of each cell type, to allow near-optimal phenotypic expression of induced mutations (typically a minimum of 7-9 days). Following phenotypic expression, mutant frequency is determined by seeding known numbers of cells in medium containing the selective agent to detect mutant colonies, and in medium without selective agent to determine the cloning efficiency (viability). After a suitable incubation time, colonies are counted. Mutant frequency is calculated based on the number of mutant colonies corrected by the cloning efficiency at the time of mutant selection.
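The calculation described at the end of paragraph 8 can be illustrated as follows. This Python sketch assumes the conventional reading of the text, namely that cloning efficiency is the ratio of colonies counted to cells plated in non-selective medium, and that mutant frequency is the number of mutant colonies divided by the number of cells plated in selective medium corrected by that cloning efficiency. The function names and the example figures are hypothetical.

```python
def cloning_efficiency(colonies: int, cells_plated: int) -> float:
    """Cloning efficiency in medium without selective agent."""
    if cells_plated <= 0:
        raise ValueError("cells_plated must be positive")
    return colonies / cells_plated


def mutant_frequency(mutant_colonies: int,
                     cells_plated_selective: int,
                     cloning_eff: float) -> float:
    """Mutant colonies per viable cell plated in selective medium."""
    if cells_plated_selective <= 0 or cloning_eff <= 0:
        raise ValueError("cell number and cloning efficiency must be positive")
    return mutant_colonies / (cells_plated_selective * cloning_eff)


# Hypothetical figures: 160 colonies from 200 cells without selective agent,
# 12 mutant colonies from 2 000 000 cells plated in selective medium.
ce = cloning_efficiency(160, 200)
mf = mutant_frequency(12, 2_000_000, ce)
print(f"CE = {ce:.2f}, MF = {mf * 1e6:.1f} mutants per 10^6 viable cells")
```

Expressing the result per 10^6 viable cells is a common reporting convention and is used here only for readability.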
DESCRIPTION OF THE METHOD
Preparations
Cells
9.
The cell types used for the HPRT and XPRT tests should have a demonstrated sensitivity to chemical mutagens, a high cloning efficiency, a stable karyotype, and a stable spontaneous mutant frequency. The most commonly used cells for the HPRT test include the CHO, CHL and V79 lines of Chinese hamster cells, L5178Y mouse lymphoma cells, and TK6 human lymphoblastoid cells (18) (19). CHO-derived AS52 cells containing the gpt transgene (and having the Hprt gene deleted) are used for the XPRT test (20) (21); the HPRT test cannot be performed in AS52 cells because the Hprt gene has been deleted. The use of other cell lines should be justified and validated.
10.
Cell lines should be checked routinely for the stability of the modal chromosome number and the absence of Mycoplasma contamination (22) (23), and cells should not be used if contaminated or if the modal chromosome number has changed. The normal cell cycle time used in the testing laboratory should be established and should be consistent with the published cell characteristics. The spontaneous mutant frequency in the master cell stock should also be checked, and the stock should not be used if the mutant frequency is not acceptable.
11.
Prior to use in this test, the cultures may need to be cleansed of pre-existing mutant cells, e.g. by culturing in HAT medium for the HPRT test and MPA for the XPRT test (5) (24) (see Appendix 1). The cleansed cells can be cryopreserved and then thawed to use as working stocks. The newly thawed working stock can be used for the test after normal doubling times are attained. When conducting the XPRT test, routine culture of AS52 cells should use conditions that assure the maintenance of the gpt transgene (20).
Media and culture conditions
12.
Appropriate culture medium and incubation conditions (culture vessels, humidified atmosphere of 5 % CO2, and incubation temperature of 37 °C) should be used for maintaining cultures. Cell cultures should always be maintained under conditions that ensure that they are growing in log phase. It is particularly important that media and culture conditions be chosen to ensure optimal growth of cells during the expression period and optimal cloning efficiency for both mutant and non-mutant cells.
Preparation of cultures
13.
Cell lines are propagated from stock cultures, seeded in culture medium at a density such that the cells in suspensions or in monolayers will continue to grow exponentially through the treatment and expression periods (e.g. confluence should be avoided for cells growing in monolayers).
Metabolic activation
14.
Exogenous metabolising systems should be used when employing cells which have inadequate endogenous metabolic capacity. The most commonly used system, that is recommended by default, unless otherwise justified, is a co-factor-supplemented post-mitochondrial fraction (S9) prepared from the livers of rodents (generally rats) treated with enzyme-inducing agents such as Aroclor 1254 (25) (26) (27) (28) or a combination of phenobarbital and β-naphthoflavone (29) (30) (31) (32). The latter combination does not conflict with the Stockholm Convention on Persistent Organic Pollutants (33) and has been shown to be as effective as Aroclor 1254 for inducing mixed-function oxidases (29) (31). The S9 fraction typically is used at concentrations ranging from 1 to 2 % (v/v) but may be increased to 10 % (v/v) in the final test medium. The choice of the type and concentration of exogenous metabolic activation system or metabolic inducer employed may be influenced by the class of substances being tested (34) (35) (36).
Test chemical preparation
15.
Solid test chemicals should be prepared in appropriate solvents and diluted, if appropriate, prior to treatment of the cells (see paragraph 16). Liquid test chemicals may be added directly to the test system and/or diluted prior to treatment of the test system. Gaseous or volatile test chemicals should be tested by appropriate modifications to the standard protocols, such as treatment in sealed culture vessels (37) (38). Preparations of the test chemical should be made just prior to treatment unless stability data demonstrate the acceptability of storage.
TEST CONDITIONS
Solvents
16.
The solvent should be chosen to optimise the solubility of the test chemicals without adversely impacting the conduct of the test (e.g. changing cell growth, affecting the integrity of the test chemical, reacting with culture vessels, impairing the metabolic activation system). It is recommended that, wherever possible, the use of an aqueous solvent (or culture medium) should be considered first. Well-established solvents are, for example, water and dimethyl sulfoxide. Generally, organic solvents should not exceed 1 % (v/v) and aqueous solvents (saline or water) should not exceed 10 % (v/v) in the final treatment medium. If the solvents used are not well-established (e.g. ethanol or acetone), their use should be supported by data indicating their compatibility with the test chemicals and the test system, and their lack of genetic toxicity at the concentration used. In the absence of such supporting data, it is important to add untreated controls (see Appendix 1) to demonstrate that no deleterious or mutagenic effects are induced by the chosen solvent.
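The general solvent limits in paragraph 16 can be checked arithmetically. The Python sketch below is a minimal illustration under the assumption that volumes are additive; the function names and example volumes are hypothetical, and the limits are the general ones stated in the paragraph (1 % v/v for organic solvents, 10 % v/v for aqueous solvents).

```python
# General limits from paragraph 16, as % (v/v) in the final treatment medium.
ORGANIC_LIMIT_PERCENT = 1.0
AQUEOUS_LIMIT_PERCENT = 10.0


def solvent_percent(solvent_volume_ml: float, final_volume_ml: float) -> float:
    """Solvent content of the final treatment medium in % (v/v)."""
    if final_volume_ml <= 0 or solvent_volume_ml < 0:
        raise ValueError("volumes must be non-negative and final volume positive")
    return 100.0 * solvent_volume_ml / final_volume_ml


def within_general_limit(solvent_volume_ml: float,
                         final_volume_ml: float,
                         aqueous: bool) -> bool:
    """True if the solvent content does not exceed the general limit."""
    limit = AQUEOUS_LIMIT_PERCENT if aqueous else ORGANIC_LIMIT_PERCENT
    return solvent_percent(solvent_volume_ml, final_volume_ml) <= limit


# Hypothetical example: 0.05 ml of dimethyl sulfoxide in 10 ml of medium.
print(within_general_limit(0.05, 10.0, aqueous=False))
```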
Measuring cytotoxicity and choosing exposure concentrations
17.
When determining the highest test chemical concentration, concentrations that have the capability of producing artefactual positive responses, such as those producing excessive cytotoxicity (see paragraph 20), precipitation in the culture medium (see paragraph 21), or marked changes in pH or osmolality (see paragraph 5) should be avoided. If the test chemical causes a marked change in the pH of the medium at the time of addition, the pH might be adjusted by buffering the final treatment medium so as to avoid artefactual positive results and to maintain appropriate culture conditions.
18.
Concentration selection is based on cytotoxicity and other considerations (see paragraphs 20-22). While the evaluation of cytotoxicity in an initial test may be useful to better define the concentrations to be used in the main experiment, an initial test is not required. Even if an initial cytotoxicity evaluation is performed, the measurement of cytotoxicity for each culture is still required in the main experiment. Cytotoxicity should be evaluated using RS, i.e. cloning efficiency (CE) of cells plated immediately after treatment, adjusted by any loss of cells during treatment, based on cell count, as compared with adjusted cloning efficiency in negative controls (assigned a survival of 100 %) (see Appendix 2 for the formula).
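The relative survival calculation described in paragraph 18 can be sketched as follows. The formula in Appendix 2 remains the authoritative one; this Python example merely follows the wording of the paragraph (cloning efficiency adjusted by the loss of cells during treatment, expressed relative to the adjusted cloning efficiency of the negative control). The function names and all figures are hypothetical.

```python
def adjusted_cloning_efficiency(colonies: int,
                                cells_plated: int,
                                cells_at_start: float,
                                cells_at_end: float) -> float:
    """Cloning efficiency adjusted for cell loss during treatment."""
    if cells_plated <= 0 or cells_at_start <= 0:
        raise ValueError("cell numbers must be positive")
    ce = colonies / cells_plated
    return ce * (cells_at_end / cells_at_start)


def relative_survival(adjusted_ce_treated: float,
                      adjusted_ce_control: float) -> float:
    """Relative survival in %, the negative control being 100 %."""
    if adjusted_ce_control <= 0:
        raise ValueError("control cloning efficiency must be positive")
    return 100.0 * adjusted_ce_treated / adjusted_ce_control


# Hypothetical treated culture: 100 colonies from 200 cells plated,
# cell count falling from 1 000 000 to 800 000 during treatment.
treated = adjusted_cloning_efficiency(100, 200, 1_000_000, 800_000)
# Hypothetical negative control: 160 colonies from 200 cells, no cell loss.
control = adjusted_cloning_efficiency(160, 200, 1_000_000, 1_000_000)
print(f"RS = {relative_survival(treated, control):.0f} %")
```

In this example the treated culture has an adjusted cloning efficiency of 0,4 against 0,8 in the control, giving a relative survival of approximately 50 %.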
19.
At least four test concentrations (not including the solvent and positive controls) that meet the acceptability criteria (appropriate cytotoxicity, number of cells, etc.) should be evaluated. While the use of duplicate cultures is advisable, either replicate or single treated cultures may be used at each concentration tested. The results obtained in the independent replicate cultures at a given concentration should be reported separately but can be pooled for the data analysis (17). For test chemicals demonstrating little or no cytotoxicity, concentration intervals of approximately 2- to 3-fold will usually be appropriate. Where cytotoxicity occurs, the test concentrations selected should cover a range from that producing cytotoxicity to concentrations at which there is moderate and little or no cytotoxicity. Many test chemicals exhibit steep concentration-response curves and, in order to cover the whole range of cytotoxicity or to study the concentration-response relationship in detail, it may be necessary to use more closely spaced concentrations and more than four concentrations, in particular in situations where a repeat experiment is required (see paragraph 43). The use of more than four concentrations may be particularly important when using single cultures.
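The spacing of concentrations described in paragraph 19 can be illustrated with a simple geometric series. The Python sketch below is an assumption about how such a series might be generated, not a prescribed procedure; the function name, the default spacing factor of 2 and the example top concentration are hypothetical.

```python
from typing import List

# Paragraph 19 calls for at least four test concentrations.
MIN_CONCENTRATIONS = 4


def concentration_series(top: float,
                         factor: float = 2.0,
                         count: int = MIN_CONCENTRATIONS) -> List[float]:
    """Descending geometric series starting at the top concentration."""
    if top <= 0:
        raise ValueError("top concentration must be positive")
    if factor <= 1:
        raise ValueError("spacing factor must be greater than 1")
    if count < MIN_CONCENTRATIONS:
        raise ValueError("at least four concentrations are needed")
    return [top / factor ** i for i in range(count)]


# Hypothetical example: top concentration of 2000 (e.g. in µg/ml), 2-fold spacing.
print(concentration_series(2000.0))
```

A factor between 2 and 3 reflects the intervals mentioned for chemicals with little or no cytotoxicity; for steep concentration-response curves a smaller factor and a larger `count` would correspond to the closer spacing the paragraph describes.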
20.
If the maximum concentration is based on cytotoxicity, the highest concentration should aim to achieve between 20 and 10 % RS. Care should be taken when interpreting positive results only found at 10 % RS or below (paragraph 43).
21.
For poorly soluble test chemicals that are not cytotoxic at concentrations below the lowest insoluble concentration, the highest concentration analysed should produce turbidity or a precipitate visible by eye or with the aid of an inverted microscope at the end of the treatment with the test chemical. Even if cytotoxicity occurs above the lowest insoluble concentration, it is advisable to test at only one concentration producing turbidity or with a visible precipitate because artefactual effects may result from the precipitate. At the concentration producing a precipitate, care should be taken to assure that the precipitate does not interfere with the conduct of the test. The determination of solubility in the culture medium prior to the experiment may be useful.
22.
If no precipitate or limiting cytotoxicity is observed, the highest test concentration should correspond to 10 mM, 2 mg/ml or 2 μl/ml, whichever is the lowest (39) (40). When the test chemical is not of defined composition, e.g. substance of unknown or variable composition, complex reaction products or biological materials (i.e. Chemical Substances of Unknown or Variable Composition (UVCBs)) (41), environmental extracts, etc., the top concentration may need to be higher (e.g. 5 mg/ml), in the absence of sufficient cytotoxicity, to increase the concentration of each of the components. It should be noted however that these requirements may differ for human pharmaceuticals (42).
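The 'whichever is the lowest' rule can be illustrated with a short calculation. The sketch below is only an illustration and not part of the test method: the function name and its parameters are hypothetical, it assumes that the molecular weight of the test chemical is known, and it assumes that the 2 μl/ml limit is converted to mg/ml using the density of a liquid test chemical.

```python
def top_concentration_mg_per_ml(molecular_weight, density_g_per_ml=None):
    """Return (limit_name, value in mg/ml) for the lowest applicable limit.

    molecular_weight: g/mol of the test chemical (assumed known).
    density_g_per_ml: optional; only used to convert the 2 ul/ml limit
                      for liquids (illustrative assumption).
    """
    if molecular_weight <= 0:
        raise ValueError("molecular_weight must be positive")

    candidates = {
        # 10 mmol/l * MW g/mol = 10*MW mg/l = 10*MW/1000 mg/ml
        "10 mM": 10.0 * molecular_weight / 1000.0,
        "2 mg/ml": 2.0,
    }
    if density_g_per_ml is not None:
        if density_g_per_ml <= 0:
            raise ValueError("density_g_per_ml must be positive")
        # 1 ul of a liquid with density d g/ml weighs d mg
        candidates["2 ul/ml"] = 2.0 * density_g_per_ml

    # Pick the limit that gives the lowest concentration in mg/ml
    name = min(candidates, key=candidates.get)
    return name, candidates[name]


if __name__ == "__main__":
    print(top_concentration_mg_per_ml(150.0))
    print(top_concentration_mg_per_ml(300.0))
    print(top_concentration_mg_per_ml(300.0, density_g_per_ml=0.8))
```

For a low molecular weight chemical the 10 mM limit is usually the binding one, whereas for heavier molecules the 2 mg/ml limit applies first. The higher top concentration mentioned for UVCBs is deliberately not modelled here.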
Controls
23.
Concurrent negative controls (see paragraph 16), consisting of solvent alone in the treatment medium and handled in the same way as the treatment cultures, should be included for every experimental condition.
24.
Concurrent positive controls are needed to demonstrate the ability of the laboratory to identify mutagens under the conditions of the test protocol used and the effectiveness of the exogenous metabolic activation system, when applicable. Examples of positive controls are given in Table 1 below. Alternative positive control substances can be used, if justified. Because in vitro mammalian cell tests for genetic toxicity are sufficiently standardised, tests using treatments with and without exogenous metabolic activation may be conducted using only a positive control requiring metabolic activation. In this case, this single positive control response will demonstrate both the activity of the metabolic activation system and the responsiveness of the test system. Each positive control should be used at one or more concentrations expected to give reproducible and detectable increases over background in order to demonstrate the sensitivity of the test system, and the response should not be compromised by cytotoxicity exceeding the limits specified in this test method (see paragraph 20).
Table 1
Reference substances recommended for assessing laboratory proficiency and for selection of positive controls
25.
Proliferating cells are treated with the test chemical in the presence and absence of a metabolic activation system. Exposure should be for a suitable period of time (usually 3 to 6 hours is adequate).
26.
The minimum number of cells used for each test (control and treated) culture at each stage in the test should be based on the spontaneous mutant frequency. A general guide is to treat and passage sufficient cells to maintain 10 spontaneous mutants in every culture in all phases of the test (17). The spontaneous mutant frequency is generally between 5 × 10⁻⁶ and 20 × 10⁻⁶. For a spontaneous mutant frequency of 5 × 10⁻⁶ and to maintain a sufficient number of spontaneous mutants (10 or more) even for the cultures treated at concentrations that cause 90 % cytotoxicity during treatment (10 % RS), it would be necessary to treat at least 20 × 10⁶ cells. In addition a sufficient number of cells (but never less than 2 million) must be cultured during the expression period and plated for mutant selection (17).
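The arithmetic behind this guide value can be checked with a minimal sketch. The function and parameter names below are hypothetical, and the calculation assumes that the required number of treated cells is simply the target number of spontaneous mutants divided by the spontaneous mutant frequency and by the fraction of cells surviving treatment.

```python
import math
from fractions import Fraction


def min_cells_to_treat(mf_per_million, target_mutants=10, rs_percent=10):
    """Cells to treat so that `target_mutants` spontaneous mutants remain.

    mf_per_million: spontaneous mutant frequency, mutants per 10^6 cells.
    target_mutants: spontaneous mutants to maintain per culture.
    rs_percent:     relative survival after treatment, in percent.
    """
    if mf_per_million <= 0 or rs_percent <= 0:
        raise ValueError("mf_per_million and rs_percent must be positive")
    # surviving cells needed = target_mutants / (mf_per_million / 10^6)
    # treated cells needed   = surviving cells / (rs_percent / 100)
    cells = (Fraction(target_mutants) * 10**6 * 100
             / (Fraction(mf_per_million) * Fraction(rs_percent)))
    return math.ceil(cells)


if __name__ == "__main__":
    # Spontaneous frequency of 5 per million cells, 10 % RS
    print(min_cells_to_treat(5))
    # Spontaneous frequency of 20 per million cells, 10 % RS
    print(min_cells_to_treat(20))
```

With a spontaneous frequency of 5 mutants per million cells and 10 % RS, the sketch returns 20 million cells, which is the figure given in the paragraph. Exact rational arithmetic is used only to avoid rounding artefacts in the ceiling step.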
Phenotypic expression time and measuring mutant frequency
27.
After the treatment period, cells are cultured to allow expression of the mutant phenotype. A minimum of 7 to 9 days generally is sufficient to allow near optimal phenotypic expression of newly induced Hprt and xprt mutants (43) (44). During this period, cells are regularly sub-cultured to maintain them in exponential growth. After phenotypic expression, cells are re-plated in medium with and without selective agent (6-thioguanine) for the determination of the number of mutants and cloning efficiency at the time of selection, respectively. This plating can be accomplished using dishes for monolayer cultures or microwell plates for cells in suspension. For mutant selection, cells should be plated at a density to assure optimum mutant recovery (i.e. avoid metabolic cooperation) (17). Plates are incubated for an appropriate length of time for optimum colony growth (e.g. 7-12 days) and colonies counted. Mutant frequency is calculated based on the number of mutant colonies corrected by the cloning efficiency at the time of mutant selection (see Appendix 2 for formulas).
Proficiency of the laboratory
28.
In order to establish sufficient experience with the test prior to using it for routine testing, the laboratory should have performed a series of experiments with reference positive substances acting via different mechanisms (at least one active with and one active without metabolic activation selected from the substances listed in Table 1) and various negative controls (using various solvents/vehicles). These positive and negative control responses should be consistent with the literature. This is not applicable to laboratories that have experience, i.e. that have a historical database available as defined in paragraphs 30 to 33.
29.
A selection of positive control substances (see Table 1 in paragraph 25) should be investigated in the absence and in the presence of metabolic activation, in order to demonstrate proficiency to detect mutagenic chemicals, to determine the effectiveness of the metabolic activation system and to demonstrate the appropriateness of the cell growth conditions during treatment, phenotypic expression and mutant selection and of the scoring procedures. A range of concentrations of the selected substances should be chosen so as to give reproducible and concentration-related increases above the background in order to demonstrate the sensitivity and dynamic range of the test system.
Historical control data
30.
The laboratory should establish:
—
A historical positive control range and distribution,
—
A historical negative (untreated, solvent) control range and distribution.
31.
When first acquiring data for an historical negative control distribution, concurrent negative controls should be consistent with published control data (22). As more experimental data are added to the control distribution, concurrent negative controls should ideally be within the 95 % control limits of that distribution (17) (45) (46).
32.
The laboratory’s historical negative control database should initially be built with a minimum of 10 experiments but would preferably consist of at least 20 experiments conducted under comparable experimental conditions. Laboratories should use quality control methods, such as control charts (e.g. C-charts or X-bar charts (47)), to identify how variable their positive and negative control data are, and to show that the methodology is ‘under control’ in their laboratory (46). Further recommendations on how to build and use the historical data (i.e. criteria for inclusion and exclusion of data in historical data and the acceptability criteria for a given experiment) can be found in the literature (45).
33.
Negative control data should consist of mutant frequencies from single or preferably replicate cultures as described in paragraph 23. Concurrent negative controls should ideally be within the 95 % control limits of the distribution of the laboratory’s historical negative control database (17) (45) (46). Where concurrent negative control data fall outside the 95 % control limit they may be acceptable for inclusion in the historical control distribution as long as these data are not extreme outliers and there is evidence that the test system is ‘under control’ (see above) and there is evidence of no technical or human failure.
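As an illustration of how a concurrent negative control might be compared with the historical distribution, the sketch below derives approximate 95 % control limits from historical mutant frequencies. It is a simplified example only: the function names are hypothetical, and it assumes a normal approximation (mean ± 1,96 standard deviations), whereas the literature cited above (45) (46) should be consulted for the appropriate statistical approach, for example Poisson-based limits.

```python
import statistics


def control_limits_95(historical_mfs, min_experiments=10):
    """Approximate 95 % control limits from historical negative controls.

    historical_mfs: mutant frequencies (e.g. per 10^6 viable cells), one
                    value per historical experiment.
    Uses mean +/- 1.96 * SD (normal approximation, an assumption made
    for this sketch only).
    """
    if len(historical_mfs) < min_experiments:
        raise ValueError("at least %d experiments are needed" % min_experiments)
    mean = statistics.mean(historical_mfs)
    sd = statistics.stdev(historical_mfs)
    lower = max(0.0, mean - 1.96 * sd)  # a frequency cannot be negative
    upper = mean + 1.96 * sd
    return lower, upper


def within_limits(concurrent_mf, limits):
    """True if the concurrent control lies inside the control limits."""
    lower, upper = limits
    return lower <= concurrent_mf <= upper


if __name__ == "__main__":
    history = [4, 5, 6, 5, 4, 6, 5, 5, 4, 6]  # made-up illustrative values
    limits = control_limits_95(history)
    print(limits, within_limits(6.0, limits), within_limits(8.0, limits))
```

A value outside these limits is not automatically excluded: as the paragraph states, it may still be acceptable if it is not an extreme outlier and the test system is shown to be 'under control'.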
34.
Any changes to the experimental protocol should be considered in terms of their consistency with the laboratory’s existing historical control databases. Any major inconsistencies should result in the establishment of a new historical control database.
DATA AND REPORTING
Presentation of the results
35.
The presentation of results should include all of the data needed to calculate cytotoxicity (expressed as RS). The data, for both treated and control cultures, should include the number of cells at the end of treatment, the number of cells plated immediately following treatment, and the colony counts (or number of wells without colonies for the microwell method). RS for each culture should be expressed as a percentage relative to the concurrent solvent control (refer to Appendix 1 for definitions).
36.
The presentation of results should also include all of the data needed to calculate the mutant frequency. Data for both treated and control cultures should include: (1) the number of cells plated with and without selective agent (at the time the cells are plated for mutant selection), and (2) the number of colonies counted (or the number of wells without colonies for the microwell method) from the plates with and without selective agent. Mutant frequency is calculated based on the number of mutant colonies (in the plates with selective agent) corrected by the cloning efficiency (from the plates without selective agent). The mutant frequency should be expressed as the number of mutant cells per million viable cells (refer to Appendix 1 for definitions).
37.
Individual culture data should be provided. Additionally, all data should be summarised in tabular form.
Acceptability Criteria
38.
Acceptance of a test is based on the following criteria:
—
The concurrent negative control is considered acceptable for addition to the laboratory historical negative control database as described in paragraph 33.
—
Concurrent positive controls (see paragraph 24) should induce responses that are compatible with those generated in the historical positive control database and produce a statistically significant increase compared with the concurrent negative control.
—
Two experimental conditions (i.e. with and without metabolic activation) were tested unless one resulted in positive results (see paragraph 25).
—
Adequate number of cells and concentrations are analysable (paragraphs 25, 26 and 19).
—
The criteria for the selection of top concentration are consistent with those described in paragraphs 20, 21 and 22.
Evaluation and interpretation of results
39.
Providing that all acceptability criteria are fulfilled, a test chemical is considered to be clearly positive if, in any of the experimental conditions examined:
—
at least one of the test concentrations exhibits a statistically significant increase compared with the concurrent negative control,
—
the increase is concentration-related when evaluated with an appropriate trend test,
—
any of the results are outside the distribution of the historical negative control data (e.g. Poisson-based 95 % control limit; see paragraph 33).
When all of these criteria are met, the test chemical is then considered able to induce gene mutations in cultured mammalian cells in this test system. Recommendations for the most appropriate statistical methods can be found in the literature (46) (48).
40.
Providing that all acceptability criteria are fulfilled, a test chemical is considered clearly negative if, in all experimental conditions examined:
—
none of the test concentrations exhibits a statistically significant increase compared with the concurrent negative control,
—
there is no concentration-related increase when evaluated with an appropriate trend test,
—
all results are inside the distribution of the historical negative control data (e.g. Poisson-based 95 % control limit; see paragraph 33).
The test chemical is then considered unable to induce gene mutations in cultured mammalian cells in this test system.
41.
There is no requirement for verification of a clearly positive or negative response.
42.
In cases when the response is neither clearly negative nor clearly positive as described above, or in order to assist in establishing the biological relevance of a result, the data should be evaluated by expert judgement and/or further investigations. Performing a repeat experiment possibly using modified experimental conditions (e.g. concentration spacing, other metabolic activation conditions [i.e. S9 concentration or S9 origin]) could be useful.
43.
In rare cases, even after further investigations, the data set will preclude making a conclusion of positive or negative results. Therefore the test chemical response should be concluded to be equivocal (interpreted as equally likely to be positive or negative).
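The decision logic of the evaluation criteria above can be summarised in a short sketch. It is an illustration only: the class and function names are hypothetical, the three flags are assumed to have been determined beforehand by appropriate statistical methods, and all acceptability criteria are assumed to be fulfilled.

```python
from dataclasses import dataclass


@dataclass
class ConditionResult:
    """Outcome of one experimental condition (e.g. with or without S9)."""
    significant_increase: bool   # >= 1 concentration significantly above the concurrent negative control
    concentration_related: bool  # trend test shows a concentration-related increase
    outside_historical: bool     # any result outside the historical negative control distribution


def classify_response(conditions):
    """Classify a test chemical from the results of all conditions examined."""
    if not conditions:
        raise ValueError("at least one experimental condition is required")

    def flags(c):
        return (c.significant_increase, c.concentration_related, c.outside_historical)

    # Clearly positive: all three criteria met in ANY condition
    if any(all(flags(c)) for c in conditions):
        return "clearly positive"
    # Clearly negative: none of the criteria met in ALL conditions
    if all(not any(flags(c)) for c in conditions):
        return "clearly negative"
    # Otherwise: expert judgement and/or further investigations
    return "expert judgement and/or further investigations"


if __name__ == "__main__":
    without_s9 = ConditionResult(False, False, False)
    with_s9 = ConditionResult(True, True, True)
    print(classify_response([without_s9, with_s9]))
```

An equivocal conclusion is not returned by the sketch, since it can only be reached after the expert judgement and further investigations described in the text.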
Test report
44.
The test report should include the following information:
Test chemical:
—
source, lot number, limit date for use, if available;
—
stability of the test chemical itself, if known;
—
solubility and stability of the test chemical in solvent, if known;
—
measurement of pH, osmolality and precipitate in the culture medium to which the test chemical was added, as appropriate.
Mono-constituent substance:
—
physical appearance, water solubility, and additional relevant physicochemical properties;
—
chemical identification, such as IUPAC or CAS name, CAS number, SMILES or InChI code, structural formula, purity, chemical identity of impurities as appropriate and practically feasible, etc.
Multi-constituent substance, UVCBs and mixtures:
—
characterised as far as possible by chemical identity (see above), quantitative occurrence and relevant physicochemical properties of the constituents.
Solvent:
—
justification for choice of solvent;
—
percentage of solvent in the final culture medium.
Cells:
For Laboratory master cultures:
—
type, source of cell lines;
—
number of passages, if available, and history in the laboratory;
—
karyotype features and/or modal number of chromosomes;
—
methods for maintenance of cell cultures;
—
absence of mycoplasma;
—
cell doubling times.
Test conditions:
—
rationale for selection of concentrations and number of cultures including, e.g. cytotoxicity data and solubility limitations;
—
composition of media, CO2 concentration, humidity level;
—
concentration of test chemical expressed as final concentration in the culture medium (e.g. μg or mg/ml or mM of culture medium);
—
concentration (and/or volume) of solvent and test chemical added in the culture medium;
—
incubation temperature;
—
incubation time;
—
duration of treatment;
—
cell density during treatment;
—
type and composition of metabolic activation system (source of S9, method of preparation of the S9 mix, the concentration or volume of S9 mix and S9 in the final culture medium, quality controls of S9);
—
positive and negative control substances, final concentrations for each condition of treatment;
—
length of expression period (including number of cells seeded, and subcultures and feeding schedules, if appropriate);
—
identity of the selective agent and its concentration;
—
criteria for acceptability of tests;
—
methods used to enumerate numbers of viable and mutant cells;
—
methods used for the measurements of cytotoxicity;
—
any supplementary information relevant to cytotoxicity and method used;
—
duration of incubation times after plating;
—
criteria for considering studies as positive, negative or equivocal;
—
methods used to determine pH, osmolality and precipitation.
Results:
—
number of cells treated and number of cells sub-cultured for each culture;
—
cytotoxicity measurements and other observations if any;
—
signs of precipitation and time of the determination;
—
number of cells plated in selective and non-selective medium;
—
number of colonies in non-selective medium and number of resistant colonies in selective medium, and related mutant frequencies;
—
concentration-response relationship, where possible;
—
concurrent negative (solvent) and positive control data (concentrations and solvents);
—
historical negative (solvent) and positive control data, with ranges, means and standard deviations and confidence interval (e.g. 95 %) as well as the number of data;
—
statistical analyses (for individual cultures and pooled replicates if appropriate), and p-values if any.
Discussion of the results.
Conclusion
LITERATURE
(1)
OECD (2016). Overview of the set of OECD Genetic Toxicology Test Guidelines and updates performed in 2014-2015. ENV Publications. Series on Testing and Assessment, No 234, OECD, Paris.
(2)
Moore M.M., DeMarini D.M., DeSerres F.J. and Tindall, K.R. (Eds.) (1987). Banbury Report 28: Mammalian Cell Mutagenesis, Cold Spring Harbor Laboratory, New York, New York.
(3)
Chu E.H.Y. and Malling H.V. (1968). Mammalian Cell Genetics. II. Chemical Induction of Specific Locus Mutations in Chinese Hamster Cells In Vitro, Proc. Natl. Acad. Sci., USA, 61, 1306-1312.
(4)
Moore M.M., Harrington-Brock K., Doerr C.L. and Dearfield K.L. (1989). Differential Mutant Quantitation at the Mouse Lymphoma TK and CHO HGPRT Loci. Mutagen. Res., 4, 394-403.
(5)
Aaron C.S. and Stankowski Jr. L.F. (1989). Comparison of the AS52/XPRT and the CHO/HPRT Assays: Evaluation of Six Drug Candidates. Mutation Res., 223, 121-128.
(6)
Aaron C.S., Bolcsfoldi G., Glatt H.R., Moore M., Nishi Y., Stankowski L., Theiss J. and Thompson E. (1994). Mammalian Cell Gene Mutation Assays Working Group Report. Report of the International Workshop on Standardisation of Genotoxicity Test Procedures. Mutation Res., 312, 235-239.
(7)
Li A.P., Gupta R.S., Heflich R.H. and Wasson J.S. (1988). A Review and Analysis of the Chinese Hamster Ovary/Hypoxanthine Guanine Phosphoribosyl Transferase System to Determine the Mutagenicity of Chemical Agents: A Report of Phase III of the U.S. Environmental Protection Agency Gene-tox Program. Mutation Res., 196, 17-36.
(8)
Scott D., Galloway S.M., Marshall R.R., Ishidate M., Brusick D., Ashby J. and Myhr B.C. (1991). Genotoxicity Under Extreme Culture Conditions. A Report from ICPEMC Task Group 9. Mutation Res., 257, 147-204.
(9)
Morita T., Nagaki T., Fukuda I. and Okumura K. (1992). Clastogenicity of Low pH to Various Cultured Mammalian Cells. Mutation Res., 268, 297-305.
(10)
Brusick D. (1986). Genotoxic Effects in Cultured Mammalian Cells Produced by Low pH Treatment Conditions and Increased Ion concentrations, Environ. Mutagen., 8, 789-886.
(11)
Nesslany F., Simar-Meintieres S., Watzinger M., Talahari I. and Marzin D. (2008). Characterization of the Genotoxicity of Nitrilotriacetic Acid. Environ. Mol. Mutagen., 49, 439-452.
(12)
Long L.H., Kirkland D., Whitwell J. and Halliwell B. (2007). Different Cytotoxic and Clastogenic Effects of Epigallocatechin Gallate in Various Cell-Culture Media Due to Variable Rates of its Oxidation in the Culture Medium, Mutation Res., 634, 177-183.
(13)
Kirkland D., Aardema M., Henderson L. and Müller L. (2005). Evaluation of the Ability of a Battery of Three In Vitro Genotoxicity Tests to Discriminate Rodent Carcinogens and Non-Carcinogens. I: Sensitivity, Specificity and Relative Predictivity. Mutation Res., 584, 1-256.
(14)
Li A.P., Carver J.H., Choy W.N., Hsie A.W., Gupta R.S., Loveday K.S., O'Neill J.P., Riddle J.C., Stankowski L.F. Jr. and Yang L.L. (1987). A Guide for the Performance of the Chinese Hamster Ovary Cell/Hypoxanthine-Guanine Phosphoribosyl Transferase Gene Mutation Assay. Mutation Res., 189, 135-141.
(15)
Liber H.L., Yandell D.W. and Little J.B. (1989). A Comparison of Mutation Induction at the TK and HPRT Loci in Human Lymphoblastoid Cells; Quantitative Differences are Due to an Additional Class of Mutations at the Autosomal TK Locus. Mutation Res., 216, 9-17.
(16)
Stankowski L.F. Jr., Tindall K.R. and Hsie A.W. (1986). Quantitative and Molecular Analyses of Ethyl Methanesulfonate- and ICR 191-Induced Mutation in AS52 Cells. Mutation Res., 160, 133-147.
(17)
Arlett C.F., Smith D.M., Clarke G.M., Green M.H.L., Cole J., McGregor D.B. and Asquith J.C. (1989). Mammalian Cell Gene Mutation Assays Based upon Colony Formation. In: Statistical Evaluation of Mutagenicity Test Data, Kirkland, D.J. (Ed.), Cambridge University Press, pp. 66-101.
(18)
Hsie A.W., Casciano D.A., Couch D.B., Krahn D.F., O’Neill J.P., and Whitfield B.L. (1981). The Use of Chinese Hamster Ovary Cells to Quantify Specific Locus Mutation and to Determine Mutagenicity of Chemicals; a Report of the Gene-Tox Program, Mutation Res., 86, 193-214.
(19)
Li A.P. (1981). Simplification of the CHO/HGPRT Mutation Assay Through the Growth of Chinese Hamster Ovary Cells as Unattached Cultures, Mutation Res., 85, 165-175.
(20)
Tindall K.R., Stankowski Jr., L.F., Machanoff R., and Hsie A.W. (1984). Detection of Deletion Mutations in pSV2gpt-Transformed Cells, Mol. Cell. Biol., 4, 1411-1415.
(21)
Hsie A. W., Recio L., Katz D. S., Lee C. Q., Wagner M., and Schenley R. L. (1986). Evidence for Reactive Oxygen Species Inducing Mutations in Mammalian Cells. Proc Natl Acad Sci., 83(24): 9616–9620.
(22)
Lorge E., Moore M., Clements J., Donovan M. O., Honma M., Kohara A., Van Benthem J., Galloway S., Armstrong M.J., Thybaud V., Gollapudi B., Aardema M., Kim J., Sutter A., Kirkland D.J. (2015). Standardized Cell Sources and Recommendations for Good Cell Culture Practices in Genotoxicity Testing. (Manuscript in preparation).
(23)
Coecke S., Balls M., Bowe G., Davis J., Gstraunthaler G., Hartung T., Hay R., Merten O.W., Price A., Schechtman L., Stacey G. and Stokes W. (2005). Guidance on Good Cell Culture Practice. A Report of the Second ECVAM Task Force on Good Cell Culture Practice, ATLA, 33, 261-287.
(24)
Rosen M.P., San R.H.C. and Stich H.F. (1980). Mutagenic Activity of Ascorbate in Mammalian Cell Cultures, Can. Lett. 8, 299-305.
(25)
Natarajan A.T., Tates A.D, Van Buul P.P.W., Meijers M. and de Vogel N. (1976). Cytogenetic Effects of Mutagens/Carcinogens after Activation in a Microsomal System In Vitro, I. Induction of Chromosomal Aberrations and Sister Chromatid Exchanges by Diethylnitrosamine (DEN) and Dimethylnitrosamine (DMN) in CHO Cells in the Presence of Rat-Liver Microsomes. Mutation Res., 37, 83-90.
(26)
Abbondandolo A., Bonatti S., Corti G., Fiorio R., Loprieno N. and Mazzaccaro A. (1977). Induction of 6-Thioguanine-Resistant Mutants in V79 Chinese Hamster Cells by Mouse-Liver Microsome-Activated Dimethylnitrosamine. Mutation Res., 46, 365-373.
(27)
Ames B.N., McCann J. and Yamasaki E. (1975). Methods for Detecting Carcinogens and Mutagens with the Salmonella/Mammalian-Microsome Mutagenicity Test. Mutation Res., 31, 347-364.
(28)
Maron D.M. and Ames B.N. (1983). Revised Methods for the Salmonella Mutagenicity Test. Mutation Res., 113, 173-215.
(29)
Elliott B.M., Combes R.D., Elcombe C.R., Gatehouse D.G., Gibson G.G., Mackay J.M. and Wolf R.C. (1992). Alternatives to Aroclor 1254-Induced S9 in In Vitro Genotoxicity Assays. Mutagen. 7, 175-177.
(30)
Matsushima T., Sawamura M., Hara K. and Sugimura T. (1976). A Safe Substitute for Polychlorinated Biphenyls as an Inducer of Metabolic Activation Systems. In: In Vitro Metabolic Activation in Mutagenesis Testing, de Serres F.J., Fouts J.R., Bend J.R. and Philpot R.M. (Eds), Elsevier, North-Holland, pp. 85-88.
(31)
Ong T.-m., Mukhtar M., Wolf C.R. and Zeiger E. (1980). Differential Effects of Cytochrome P450-Inducers on Promutagen Activation Capabilities and Enzymatic Activities of S-9 from Rat Liver, J. Environ. Pathol. Toxicol., 4, 55-65.
(32)
Johnson T.E., Umbenhauer D.R. and Galloway S.M. (1996). Human Liver S-9 Metabolic Activation: Proficiency in Cytogenetic Assays and Comparison with Phenobarbital/beta-Naphthoflavone or Aroclor 1254 Induced Rat S-9, Environ. Mol. Mutagen., 28, 51-59.
(33)
UNEP. (2001). Stockholm Convention on Persistent Organic Pollutants, United Nations Environment Programme (UNEP). Available at: [http://www.pops.int.html].
(34)
Tan E.-L. and Hsie A.W. (1981). Effect of Calcium Phosphate and Alumina Cγ Gels on the Mutagenicity and Cytotoxicity of Dimethylnitrosamine as Studied in the CHO/HGPRT System. Mutation Res., 84, 147-156.
(35)
O’Neill J.P., Machanoff R., San Sebastian J.R., Hsie A.W. (1982). Cytotoxicity and Mutagenicity of Dimethylnitrosamine in Mammalian Cells (CHO/HGPRT system): Enhancement by Calcium Phosphate. Environ. Mol. Mutation., 4, 7-18.
(36)
Li, A.P. (1984). Use of Aroclor 1254-Induced Rat Liver Homogenate in the Assaying of Promutagens in Chinese Hamster Ovary Cells. Environ. Mol. Mutation, 4, 7-18.
(37)
Krahn D.F., Barsky F.C. and McCooey K.T. (1982). CHO/HGPRT Mutation Assay: Evaluation of Gases and Volatile Liquids. In: Tice, R.R., Costa, D.L., Schaich, K.M. (eds.) Genotoxic Effects of Airborne Agents. New York, Plenum, pp. 91-103.
(38)
Zamora P.O., Benson J.M., Li A.P. and Brooks A.L. (1983). Evaluation of an Exposure System Using Cells Grown on Collagen Gels for Detecting Highly Volatile Mutagens in the CHO/HGPRT Mutation Assay. Environ. Mutagen., 5, 795-801.
(39)
OECD (2014). Document Supporting the WNT Decision to Implement Revised Criteria for the Selection of the Top Concentration in the In Vitro Mammalian Cell Assays on Genotoxicity (Test Guidelines 473, 476 and 487). Available upon request from the Organisation for Economic Cooperation and Development.
(40)
Brookmire L., Chen J.J. and Levy D.D. (2013). Evaluation of the Highest Concentrations Used in the In Vitro Chromosome Aberrations Assay, Environ. Mol. Mutagen., 54, 36-43.
(41)
EPA, Office of Chemical Safety and Pollution Prevention. (2011). Chemical Substances of Unknown or Variable Composition, Complex Reaction Products and Biological Materials: UVCB Substances,
(42)
USFDA (2012). International Conference on Harmonisation (ICH) Guidance S2 (R1) on Genotoxicity Testing and Data Interpretation for Pharmaceuticals Intended for Human Use. Available at: [https://federalregister.gov/a/2012-13774].
(43)
O’Neill J.P. and Hsie A.W. (1979). Phenotypic Expression Time of Mutagen-Induced 6-Thioguanine Resistance in Chinese Hamster Ovary Cells (CHO/HGPRT system), Mutation Res., 59, 109-118.
(44)
Chiewchanwit T., Ma H., El Zein R., Hallberg L. and Au W.W. (1995). Induction of Deletion Mutations by Methoxyacetaldehyde in Chinese Hamster Ovary (CHO)-AS52 cells. Mutation Res., 335(2), 121-128.
(45)
Hayashi M., Dearfield K., Kasper P., Lovell D., Martus H.J. and Thybaud V. (2011). Compilation and Use of Genetic Toxicity Historical Control Data, Mutation Res., 723, 87-90.
(46)
OECD (2014). Statistical Analysis Supporting the Revision of the Genotoxicity Test Guidelines. Environmental, Health and Safety, Series on testing and assessment (No 199), Organisation for Economic Cooperation and Development, Paris.
(47)
Richardson C., Williams D.A., Allen J.A., Amphlett G., Chanter D.O. and Phillips B. (1989). Analysis of Data from In Vitro Cytogenetic Assays. In: Statistical Evaluation of Mutagenicity Test Data, Kirkland, D.J. (Ed.), Cambridge University Press, Cambridge, pp. 141-154.
(48)
Fleiss J.L., Levin B. and Paik M.C. (2003). Statistical Methods for Rates and Proportions, Third Edition, New York: John Wiley & Sons.
Appendix 1
DEFINITIONS
Base pair substitution mutagens: chemicals that cause substitution of base pairs in the DNA.
Chemical: A substance or a mixture.
Cloning efficiency: The percentage of cells plated at a low density that are able to grow into a colony that can be counted.
Concentrations: refer to final concentrations of the test chemical in culture medium.
Cytotoxicity: For the assays covered in this test method, cytotoxicity is identified as a reduction in relative survival of the treated cells as compared to the negative control (see specific paragraph).
Forward mutation: a gene mutation from the parental type to the mutant form which gives rise to an alteration or a loss of the enzymatic activity or the function of the encoded protein.
Frameshift mutagens: chemicals which cause the addition or deletion of single or multiple base pairs in the DNA molecule.
Genotoxic: a general term encompassing all types of DNA or chromosomal damage, including DNA breaks, adducts, rearrangements, mutations, chromosome aberrations, and aneuploidy. Not all types of genotoxic effects result in mutations or stable chromosomal damage.
HAT medium: medium containing Hypoxanthine, Aminopterin and Thymidine, used for cleansing of Hprt mutants.
Mitotic recombination: during mitosis, recombination between homologous chromatids possibly resulting in the induction of DNA double strand breaks or in a loss of heterozygosity.
MPA medium: medium containing Xanthine, Adenine, Thymidine, Aminopterin and Mycophenolic acid, used for cleansing of Xprt mutants.
Mutagenic: produces a heritable change of DNA base-pair sequence(s) in genes or of the structure of chromosomes (chromosome aberrations).
Mutant frequency (MF): the number of mutant colonies observed divided by the number of cells plated in selective medium, corrected for cloning efficiency (or viability) at the time of selection.
Phenotypic expression time: The time after treatment during which the genetic alteration is fixed within the genome and any preexisting gene products are depleted to the point that the phenotypic trait is altered.
Relative survival (RS): RS is used as the measure of treatment-related cytotoxicity. RS is cloning efficiency (CE) of cells plated immediately after treatment adjusted by any loss of cells during treatment as compared with cloning efficiency in negative controls (assigned a survival of 100 %).
S9 liver fractions: supernatant of liver homogenate after 9 000 g centrifugation, i.e. raw liver extract.
S9 mix: mix of the liver S9 fraction and cofactors necessary for metabolic enzyme activity.
Solvent control: General term to define the control cultures receiving the solvent alone used to dissolve the test chemical.
Test chemical: Any substance or mixture tested using this test method.
Untreated control: cultures that receive no treatment (i.e. neither test chemical nor solvent) but are processed concurrently and in the same way as the cultures receiving the test chemical.
UVCB: Chemical Substances of Unknown or Variable Composition, Complex Reaction Products and Biological Materials.
Appendix 2
FORMULAS FOR ASSESSMENT OF CYTOTOXICITY AND MUTANT FREQUENCY
Cytotoxicity is evaluated by relative survival, i.e. cloning efficiency (CE) of cells plated immediately after treatment adjusted by any loss of cells during treatment as compared with adjusted cloning efficiency in negative controls (assigned a survival of 100 %) (see RS formula below).
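Adjusted CE for a culture treated by a test chemical is calculated as:
Adjusted CE = CE × (Number of cells at the end of treatment / Number of cells at the beginning of treatment).
RS for a culture treated by a test chemical is calculated as:
RS = (Adjusted CE in treated culture / Adjusted CE in the solvent control) × 100.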
Mutant frequency is the cloning efficiency of mutant colonies in selective medium divided by the cloning efficiency in non-selective medium measured for the same culture at the time of selection.
When plates are used for cloning efficiency:
CE = Number of colonies / Number of cells plated.
When micro-well plates are used for cloning efficiency:
The number of colonies per well on micro-well plates follows a Poisson distribution.
Cloning Efficiency = -LnP(0) / Number of cells plated per well
where P(0) is the proportion of empty wells among the seeded wells, so that -LnP(0) is the expected number of colonies per well, described by the following formula:
-LnP(0) = -Ln (number of empty wells / number of plated wells)
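The calculations described in this Appendix can be sketched in a few lines of code. The example is illustrative only: all function names are hypothetical, and the adjustment of CE is assumed to be a multiplication by the ratio of the number of cells at the end of treatment to the number at the beginning of treatment, which is one reading of 'adjusted by any loss of cells during treatment'.

```python
import math


def cloning_efficiency(colonies, cells_plated):
    """CE for plates: colonies counted divided by cells plated."""
    return colonies / cells_plated


def cloning_efficiency_microwell(empty_wells, plated_wells, cells_per_well):
    """CE for micro-well plates, from the Poisson zero term P(0)."""
    if empty_wells <= 0 or empty_wells > plated_wells:
        raise ValueError("need 0 < empty_wells <= plated_wells")
    p0 = empty_wells / plated_wells          # proportion of empty wells
    return -math.log(p0) / cells_per_well    # -LnP(0) / cells per well


def adjusted_ce(ce, cells_end, cells_start):
    """CE adjusted for loss of cells during treatment (assumed form)."""
    return ce * cells_end / cells_start


def relative_survival(adj_ce_treated, adj_ce_control):
    """RS in percent; the solvent control is assigned 100 %."""
    return 100.0 * adj_ce_treated / adj_ce_control


def mutant_frequency(ce_selective, ce_non_selective):
    """MF: CE in selective medium divided by CE in non-selective medium."""
    return ce_selective / ce_non_selective


if __name__ == "__main__":
    # Made-up illustrative numbers, not taken from the test method
    control = adjusted_ce(cloning_efficiency(160, 200), 1_000_000, 1_000_000)
    treated = adjusted_ce(cloning_efficiency(80, 200), 500_000, 1_000_000)
    print("RS (%):", relative_survival(treated, control))
    mf = mutant_frequency(cloning_efficiency(10, 2_000_000),
                          cloning_efficiency(160, 200))
    print("MF per million viable cells:", mf * 1e6)
```

In the sketch, mutant frequency is converted to mutants per million viable cells only at the reporting step, so that intermediate values remain plain ratios.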
"
(3)
In Part B, Chapter B.22 is replaced by the following:
"B.22 RODENT DOMINANT LETHAL TEST
INTRODUCTION
1.
This test method (TM) is equivalent to the OECD test guideline (TG) 478 (2016). Test methods are periodically reviewed in the light of scientific progress, changing regulatory needs, and animal welfare considerations. This modified version of the test method reflects more than thirty years of experience with this test and the potential for integrating or combining this test with other toxicity tests such as developmental, reproductive toxicity, or genotoxicity studies. However, due to its limitations and the use of a large number of animals, this assay is not intended for use as a primary method, but rather as a supplemental test method which can only be used when there is no alternative for regulatory requirements. Combining toxicity testing has the potential to spare large numbers of animals from use in toxicity tests. A document that provides succinct information on genetic toxicology testing and an overview of the recent changes that were made to genetic toxicity OECD test guidelines has been developed by OECD (1).
2.
The purpose of the Dominant lethal (DL) test is to investigate whether chemicals produce mutations resulting from chromosomal aberrations in germ cells. In addition, the dominant lethal test is relevant to assessing genotoxicity because, although they may vary among species, factors of in vivo metabolism, pharmacokinetics and DNA-repair processes are active and contribute to the response. Induction of a DL mutation after exposure to a test chemical indicates that the chemical has affected germinal tissue of the test animal.
3.
DL mutations cause embryonic or foetal death. Induction of DL mutation after exposure to a test chemical indicates that the chemical has affected the germ cells of the test animal.
4.
A DL assay is useful for confirmation of positive results of tests using somatic in vivo endpoints, and is a relevant endpoint for the prediction of human hazard and risk of genetic diseases transmitted through the germline. However, this assay requires a large number of animals and is labour-intensive; as a result, it is very expensive and time-consuming to conduct. Because the spontaneous frequency of dominant lethal mutations is quite high, the sensitivity of the assay for detection of small increases in the frequency of mutations is generally limited.
5.
Definitions of key terms are set out in Appendix 1.
INITIAL CONSIDERATIONS
6.
The test is most often conducted in mice (2) (3) (4) but other species, such as rats (5) (6) (7) (8), may in some cases be appropriate if scientifically justified. DLs generally are the result of gross chromosomal aberrations (structural and numerical abnormalities) (9) (10) (11), but gene mutations cannot be excluded. A DL mutation is a mutation occurring in a germ cell per se, or is fixed post fertilisation in the early embryo, that does not cause dysfunction of the gamete, but is lethal to the fertilised egg or developing embryo.
7.
Individual males are mated sequentially to virgin females at appropriate intervals. The number of matings following treatment is dependent on the ultimate purpose of the DL study (Paragraph 23) and should ensure that all phases of male germ cell maturation are evaluated for DLs (12).
8.
If there is evidence that the test chemical, or its metabolite(s), will not reach the testis, it is not appropriate to use this test.
PRINCIPLE OF THE TEST
9.
Generally, male animals are exposed to a test chemical by an appropriate route of exposure and mated to untreated virgin females. Different germ cell types can be tested by the use of sequential mating intervals. Following mating, the females are euthanised after an appropriate period of time, and their uteri are examined to determine the numbers of implants and live and dead embryos. The dominant lethality of a test chemical is determined by comparing the live implants per female in the treated group with the live implants per female in the vehicle/solvent control group. The increase of dead implants per female in the treated group over the dead implants per female in the control group reflects the test-chemical-induced post-implantation loss. The post-implantation loss is calculated by determining the ratio of dead to total implants in the treated group compared to the ratio of dead to total implants in the control group. Pre-implantation loss can be estimated by comparing corpora lutea counts minus total implants or the total implants per female in treated and control groups.
VERIFICATION OF LABORATORY PROFICIENCY
10.
Competence in this assay should be established by demonstrating the ability to reproduce dominant lethal frequencies from published data (e.g. (13) (14) (15) (16) (17) (18)) with positive control substances (including weak responses) such as those listed in Table 1, and with vehicle controls, and by obtaining negative control frequencies that are consistent with the acceptable range of data (see references above) or with the laboratory’s historical control distribution, if available.
DESCRIPTION OF THE METHOD
Preparations
Selection of animal species
11.
Commonly used laboratory strains of healthy sexually mature animals should be employed. Mice are commonly used but rats may also be appropriate. Any other appropriate mammalian species may be used, if scientific justification is provided in the report.
Animal housing and feeding conditions
12.
For rodents, the temperature in the animal room should be 22 °C (± 3 °C). Although the relative humidity ideally should be 50-60 %, it should be at least 40 % and preferably not exceed 70 %, other than during room cleaning. Lighting should be artificial, the sequence being 12 hours light, followed by 12 hours dark. For feeding, conventional laboratory diets may be used with an unlimited supply of drinking water. The choice of diet may be influenced by the need to ensure a suitable admixture of a test chemical when administered by this route. Prior to treatment or mating, rodents should be housed in small groups (no more than five) of the same sex if no aggressive behaviour is expected or observed, preferably in solid cages with appropriate environmental enrichment. Animals may be housed individually if scientifically justified.
Preparation of the animals
13.
Healthy and sexually mature male and female adult animals are randomly assigned to the control and treatment groups. The individual animals are identified uniquely using a humane, minimally invasive method (e.g. by ringing, tagging, micro-chipping, or biometric identification, but not toe and ear clipping) and acclimated to the laboratory conditions for at least five days. Cages should be arranged in such a way that possible effects due to cage placement are minimised. Cross contamination by the positive control and the test chemical should be avoided. At the commencement of the study, the weight variation of animals should be minimal and not exceed ± 20 % of the mean weight of each sex.
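The weight criterion in the paragraph above can be checked with a short calculation. The following is a minimal sketch; the function name and the example weights are hypothetical, and the check is applied separately to each sex.

```python
from statistics import mean


def weights_within_tolerance(weights_g, tolerance=0.20):
    """Return True if every animal is within +/- tolerance of the group mean.

    weights_g: body weights (g) of the animals of ONE sex at study start.
    tolerance: allowed fractional deviation from the mean (0.20 = 20 %).
    """
    if not weights_g:
        raise ValueError("weights_g must not be empty")
    group_mean = mean(weights_g)
    return all(abs(w - group_mean) <= tolerance * group_mean for w in weights_g)
```

For example, male mice weighing 25, 27 and 30 g pass the check, whereas a group of 20, 30 and 40 g does not.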
Preparation of doses
14.
Solid test chemicals should be dissolved or suspended in appropriate solvents or vehicles or admixed in diet or drinking water prior to dosing of the animals. Liquid test chemicals may be dosed directly or diluted prior to dosing. For inhalation exposures, test chemicals can be administered as gas, vapour, or a solid/liquid aerosol, depending on their physicochemical properties. Fresh preparations of the test chemical should be employed unless stability data demonstrate the acceptability of storage and define the appropriate storage conditions.
Test Conditions
Solvent/vehicle
15.
The solvent/vehicle should not produce toxic effects at the dose volumes used, and should not be suspected of chemical reaction with the test chemical. If other than well-known solvents/vehicles are used, their inclusion should be supported with reference data indicating their compatibility. It is recommended that wherever possible, the use of an aqueous solvent/vehicle should be considered first. Examples of commonly used compatible solvents/vehicles include water, physiological saline, methylcellulose solution, carboxymethyl cellulose sodium salt solution, olive oil and corn oil.
Positive controls
16.
Concurrent positive control animals should always be used unless the laboratory has demonstrated proficiency in the conduct of the test and has used the test routinely in the recent past (e.g. within the last 5 years). However, it is not necessary to treat positive control animals by the same route as animals receiving the test chemical, or sample all the mating intervals. The positive control substances should be known to produce DLs under the conditions used for the test. Except for the treatment, animals in the control groups should be handled in an identical manner to animals in the treated groups.
17.
The doses of the positive control substances should be selected so as to produce weak or moderate effects that critically assess the performance and sensitivity of the assay, but which consistently produce positive dominant lethal effects. Examples of positive control substances, and appropriate doses, are included in Table 1.
Table 1
Examples of Positive Control Substances.
Substance [CAS no.] (reference no.) | Effective dose range (mg/kg) (rodent species) | Administration time (days)
Triethylenemelamine [51-18-3] (15) | 0,25 (mice) | 1
Cyclophosphamide [50-18-0] (19) | 50-150 (mice) | 5
Cyclophosphamide [50-18-0] (5) | 25-100 (rats) | 1
Ethyl methanesulphonate [62-50-0] (13) | 100-300 (mice) | 5
Monomeric acrylamide [79-06-1] (17) | 50 (mice) | 5
Chlorambucil [305-03-3] (14) | 25 (mice) | 1
Negative controls
18.
Negative control animals, treated with solvent or vehicle alone, and otherwise treated in the same way as the treatment groups, should be included for every sampling time (20). In the absence of historical or published control data showing that no DLs or other deleterious effects are induced by the chosen solvent/vehicle, untreated control animals should also be included for every sampling time in order to establish acceptability of the vehicle control.
PROCEDURE
Number of Animals
19.
Individual males are mated sequentially at appropriate predetermined intervals (e.g. weekly intervals, Paragraphs 21 & 23) preferably to one virgin female. The number of males per group should be predetermined to be sufficient (in combination with the number of mated females at each mating interval) to provide the statistical power necessary to detect at least a doubling in DL frequency (Paragraph 44).
20.
The number of females per mating interval should also be predetermined by statistical power calculations to permit the detection of at least a doubling in the DL frequency (i.e. sufficient pregnant females to provide at least 400 total implants) (20) (21) (22) (23) and to ensure that at least one dead implant per analysis unit (i.e. mating group per dose) is expected (24).
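As a rough planning aid, the figure of at least 400 total implants can be translated into a number of females per mating interval. The following is a minimal sketch only: the function names, the mean litter size and the fertility rate are hypothetical inputs that a laboratory would take from its own historical data, and the result does not replace the statistical power calculation required above.

```python
import math


def min_pregnant_females(mean_implants_per_female: float,
                         target_total_implants: int = 400) -> int:
    """Smallest number of pregnant females expected to yield the target implants."""
    if mean_implants_per_female <= 0:
        raise ValueError("mean_implants_per_female must be positive")
    return math.ceil(target_total_implants / mean_implants_per_female)


def females_to_mate(pregnant_needed: int, fertility_rate: float) -> int:
    """Number of females to mate, given the expected fraction that become pregnant."""
    if not 0 < fertility_rate <= 1:
        raise ValueError("fertility_rate must be in (0, 1]")
    return math.ceil(pregnant_needed / fertility_rate)
```

With an assumed mean of 12 implants per female, 34 pregnant females would be needed; at an assumed fertility rate of 0,8 this corresponds to 43 mated females per dose and mating interval.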
Administration Period and Mating Intervals
21.
The number of mating intervals following treatment is governed by the treatment schedule and should ensure that all phases of male germ cell maturation are evaluated for DL induction (12) (25). For a single treatment, or for up to five daily dose administrations, there should be 8 (mouse) or 10 (rat) matings conducted at weekly intervals following the last treatment. For multiple dose administrations, the number of mating intervals may be reduced in proportion to the increased time of the administration period, but maintaining the goal of evaluating all phases of spermatogenesis (e.g. after a 28-day exposure, only 4 weekly matings are sufficient to evaluate all phases of spermatogenesis in the mouse). All treatment and mating schedules should be scientifically justified.
22.
Females should remain with the males for at least the duration of one oestrus cycle (e.g. one week covers one oestrus cycle in both mice and rats). Females that did not mate during a one-week interval can be used for a subsequent mating interval. Alternatively, females may remain with the males until mating has occurred, as determined by the presence of sperm in the vagina or by the presence of a vaginal plug.
23.
The exposure and mating regimen used is dependent on the ultimate purpose of the DL study. If the goal is to determine whether a given chemical induces DL mutations per se, then the accepted method would be to expose an entire round of spermatogenesis (e.g. 7 weeks in the mouse, 5-7 treatments per week) and mate once at the end. However, if the goal is to identify the sensitive germ cell type for DL induction, then a single or 5 day exposure followed by weekly mating is preferred.
Dose Levels
24.
If a preliminary range-finding study is performed because there are no suitable data already available to aid in dose selection, it should be performed in the same laboratory, using the same species, strain, sex, and treatment regimen to be used in the main study (26). The study should aim to identify the maximum tolerated dose (MTD), defined as the highest dose that will be tolerated without evidence of study-limiting toxicity, relative to the duration of the study period (for example, abnormal behaviour or reactions, minor body weight depression or hematopoietic system cytotoxicity), but not death or evidence of pain, suffering or distress necessitating humane euthanasia (27).
25.
The MTD must also not adversely affect mating success (21).
26.
Test chemicals with specific biological activities at low non-toxic doses (such as hormones and mitogens), and chemicals which exhibit saturation of toxicokinetic properties may be exceptions to the dose-setting criteria and should be evaluated on a case-by-case basis.
27.
In order to obtain dose response information, a complete study should include a negative control group and a minimum of three dose levels generally separated by a factor of 2, but not greater than 4. If the test chemical does not produce toxicity in a range-finding study, or based on existing data, the highest dose for a single administration should be 2 000 mg/kg body weight. However, if the test chemical does cause toxicity, the MTD should be the highest dose administered and the dose levels used should preferably cover a range from the maximum to a dose producing little or no toxicity. For non-toxic chemicals, the limit dose for an administration period of 14 days or more is 1 000 mg/kg body weight/day, and for administration periods of less than 14 days the limit dose is 2 000 mg/kg body weight/day.
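The limit doses and the spacing of dose levels described in the paragraph above can be sketched as follows. The function names are hypothetical, and the example assumes a non-toxic test chemical so that the limit dose is used as the top dose; where toxicity occurs, the MTD would be used instead.

```python
def limit_dose_mg_per_kg(administration_days: int) -> int:
    """Limit dose (mg/kg body weight/day) for a non-toxic test chemical."""
    if administration_days < 1:
        raise ValueError("administration_days must be at least 1")
    return 1000 if administration_days >= 14 else 2000


def dose_levels(top_dose: float, factor: float = 2.0, n_levels: int = 3):
    """Dose levels in ascending order, spaced by a constant factor of 2 to 4."""
    if not 2 <= factor <= 4:
        raise ValueError("spacing factor should be between 2 and 4")
    if n_levels < 3:
        raise ValueError("at least three dose levels are required")
    return [top_dose / factor ** i for i in range(n_levels)][::-1]
```

For a five-day administration of a non-toxic chemical, `dose_levels(limit_dose_mg_per_kg(5))` gives 500, 1 000 and 2 000 mg/kg body weight/day.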
Administration of Doses
28.
The anticipated route of human exposure should be considered when designing an assay. Therefore, routes of exposure such as dietary, drinking water, subcutaneous, intravenous, topical, inhalation, oral (by gavage), or implantation may be chosen as justified. In any case, the route should be chosen to ensure adequate exposure of the target tissue(s). Intraperitoneal injection is not normally recommended since it is not an intended route of human exposure, and should only be used with specific scientific justification. If the test chemical is admixed in diet or drinking water, especially in case of single dosing, care should be taken that the delay between food and water consumption and mating is sufficient to allow detection of the effects (paragraph 31). The maximum volume of liquid that can be administered by gavage or injection at one time depends on the size of the test animal. The volume should not normally exceed 1 ml/100 g body weight except in the case of aqueous solutions where a maximum of 2 ml/100 g may be used. The use of volumes greater than this (if permitted by animal welfare legislation) should be justified. Variability in test volume should be minimised by adjusting the concentration to ensure a constant volume in relation to body weight at all dose levels.
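The volume limits and the constant-volume approach described above can be illustrated by a short calculation. This is a minimal sketch; the function names and example values are hypothetical.

```python
def max_volume_ml(body_weight_g: float, aqueous: bool = False) -> float:
    """Normal maximum volume per administration: 1 ml/100 g, or 2 ml/100 g if aqueous."""
    if body_weight_g <= 0:
        raise ValueError("body_weight_g must be positive")
    ml_per_100_g = 2.0 if aqueous else 1.0
    return ml_per_100_g * body_weight_g / 100.0


def concentration_mg_per_ml(dose_mg_per_kg: float, volume_ml_per_kg: float) -> float:
    """Concentration that delivers the dose at a fixed volume per kg body weight."""
    if volume_ml_per_kg <= 0:
        raise ValueError("volume_ml_per_kg must be positive")
    return dose_mg_per_kg / volume_ml_per_kg
```

A 30 g mouse would normally receive no more than 0,3 ml of a non-aqueous preparation. Keeping the volume at 10 ml/kg (1 ml/100 g) for doses of 500, 1 000 and 2 000 mg/kg means preparing concentrations of 50, 100 and 200 mg/ml.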
Observations
29.
General clinical observations of the test animals should be made and clinical signs recorded at least once a day, preferably at the same time(s) each day and considering the peak period of anticipated effects after dosing. At least twice daily during the dosing period, all animals should be observed for morbidity and mortality. All animals should be weighed at the beginning of the study and at least once a week during repeated dose studies, and at the time of euthanasia. Measurements of food consumption should be made at least weekly. If the test chemical is administered via the drinking water, water consumption should be measured at each change of water and at least weekly. Animals exhibiting non-lethal indicators of excess toxicity should be euthanised prior to completion of the test period (27).
Tissue Collection and Processing
30.
Females are euthanised in the second half of pregnancy at gestation day (GD) 13 for mice and GD 14-15 for rats. Uteri are examined for dominant lethal effects to determine the number of implants, live and dead embryos, and corpora lutea.
31.
The uterine horns and ovaries are exposed for counting of corpora lutea, and fetuses are removed, counted, and weighed. Care should be taken to examine the uteri for resorptions obscured by live fetuses and to ensure that all resorptions are enumerated. Fetal mortality is recorded. The number of successfully impregnated females and the number of total implantations, pre-implantation losses, and post-implantation mortality (including early and late resorptions) also are recorded. In addition, the visible fetuses may be preserved in Bouin’s fixative for at least 2 weeks followed by examination for major external malformations (28) to provide additional information on the reproductive and developmental effects of the test agent.
DATA AND REPORTING
Treatment of Results
32.
Data should be tabulated to show the number of males mated, the number of pregnant females, and the number of non-pregnant females. Results of each mating, including the identity of each male and female, should be reported individually. The mating interval, dose level for treated males, and the numbers of live implants and dead implants should be enumerated for each female.
33.
The post-implantation loss is calculated by determining the ratio of dead to total implants from the treated group compared to the ratio of dead to total implants from the vehicle/solvent control group.
34.
Pre-implantation loss is calculated as the difference between the number of corpora lutea and the number of implants, or as a reduction in the average number of implants per female in comparison with control matings. Where pre-implantation loss is estimated, it should be reported.
35.
The Dominant Lethal factor is estimated as: (post-implantation deaths/total implantations per female) × 100.
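The calculations in Paragraphs 33 to 35 can be illustrated with the following minimal sketch. The record type and function names are hypothetical, and pooling implants over all females of a group for the post-implantation ratio is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Female:
    corpora_lutea: int
    live_implants: int
    dead_implants: int

    @property
    def total_implants(self) -> int:
        return self.live_implants + self.dead_implants


def pre_implantation_loss(female: Female) -> int:
    """Corpora lutea minus total implants for one female."""
    return female.corpora_lutea - female.total_implants


def post_implantation_loss_pct(females) -> float:
    """Dead implants as a percentage of total implants, pooled over a group."""
    dead = sum(f.dead_implants for f in females)
    total = sum(f.total_implants for f in females)
    if total == 0:
        raise ValueError("no implants in this group")
    return 100.0 * dead / total


def dominant_lethal_factor_pct(female: Female) -> float:
    """(Post-implantation deaths / total implantations per female) x 100."""
    if female.total_implants == 0:
        raise ValueError("female has no implants")
    return 100.0 * female.dead_implants / female.total_implants
```

The group value for the treated animals is then compared with the value obtained in the same way for the vehicle/solvent control group.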
36.
Data on toxicity and clinical signs (as per Paragraph 29) should be reported.
Acceptability Criteria
37.
The following criteria determine the acceptability of a test.
—
Concurrent negative control is consistent with published norms for historical negative control data, and the laboratory's historical control data if available (see Paragraphs 10 and 18).
—
Concurrent positive controls induce responses that are consistent with published norms for historic positive control data, or the laboratory’s historical positive control database, if available, and produce a statistically significant increase compared with the negative control (see Paragraphs 17 and 18).
—
An adequate number of total implants and doses has been analysed (Paragraph 20).
—
The criteria for the selection of top dose are consistent with those described in Paragraphs 24 and 27.
Evaluation and Interpretation of Results
38.
At least three treated dose groups should be analysed in order to provide sufficient data for dose-response analysis.
39.
Providing that all acceptability criteria are fulfilled, a test chemical is considered a clear positive if:
—
at least one of the test doses exhibits a statistically significant increase compared with the concurrent negative control;
—
the increase is dose-related in at least one experimental condition (e.g. a weekly mating interval) when evaluated with an appropriate test; and,
—
any of the results are outside of the acceptable range of negative control data, or the distribution of the laboratory’s historical negative control data (e.g. Poisson-based 95 % control limit) if available.
The test chemical is then considered able to induce dominant lethal mutations in germ cells of the test animals. Recommendations for the most appropriate statistical methods are described in Paragraph 44; other recommended statistical approaches can also be found in the literature (20) (21) (22) (24) (29). Statistical tests used should consider the animal as the experimental unit.
40.
Providing that all acceptability criteria are fulfilled, a test chemical is considered a clear negative if:
—
none of the test doses exhibits a statistically significant increase compared with the concurrent negative control;
—
there is no dose-related increase in any experimental condition; and
—
all results are within the acceptable range of negative control data, or the laboratory’s historical negative control data (e.g. Poisson-based 95 % control limit), if available.
The test chemical is then considered unable to induce dominant lethal mutations in germ cells of the test animals.
41.
There is no requirement for verification of a clear positive or a clear negative response.
42.
If the response is not clearly negative or positive, and in order to assist in establishing the biological relevance of a result (e.g. a weak or borderline increase), the data should be evaluated by expert judgment and/or further investigations using the existing experimental data, such as consideration of whether the positive result is outside the acceptable range of negative control data, or the laboratory's historical negative control data (30).
43.
In rare cases, even after further investigations, the data set will preclude making a conclusion of positive or negative results, and will therefore be concluded as equivocal.
44.
Statistical tests used should consider the male animal as the experimental unit. While it is possible that count data (e.g. number of implants per female) may be Poisson distributed and/or proportions (e.g. proportion of dead implants) may be binomially distributed, it is often the case that such data are overdispersed (31). Accordingly, statistical analysis should first employ a test for over- or underdispersion using variance tests such as Cochran’s binomial variance test (32) or Tarone’s C(α) test for binomial overdispersion (31) (33). If no departure from binomial dispersion is detected, trends in proportions across dose levels may be tested using the Cochran-Armitage trend test (34) and pairwise comparisons with the control group may be tested using Fisher’s exact test (35). Likewise, if no departure from Poisson dispersion is detected, trends in counts may be tested using Poisson regression (36) and pairwise comparisons with the control group may be tested within the context of the Poisson model, using pairwise contrasts (36). If significant overdispersion or underdispersion is detected, nonparametric methods are recommended (23) (31). These include rank-based tests, such as the Jonckheere-Terpstra test for trend (37) and Mann-Whitney tests (38) for pairwise comparisons with the vehicle/solvent control group, as well as permutation, resampling, or bootstrap tests for trend and pairwise comparisons with the control group (31) (39).
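For the case in which no departure from binomial dispersion is detected, the two tests named above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, implants are pooled per dose group, the trend test uses the usual normal approximation, and the dispersion test that should precede this analysis is not shown.

```python
import math


def fisher_exact_one_sided(dead_t: int, live_t: int,
                           dead_c: int, live_c: int) -> float:
    """One-sided Fisher's exact p-value for an excess of dead implants
    in the treated group (t) over the control group (c)."""
    n_t = dead_t + live_t
    n_c = dead_c + live_c
    dead_total = dead_t + dead_c
    denom = math.comb(n_t + n_c, dead_total)
    upper = min(n_t, dead_total)
    # Sum hypergeometric probabilities of tables at least as extreme.
    tail = sum(math.comb(n_t, k) * math.comb(n_c, dead_total - k)
               for k in range(dead_t, upper + 1))
    return tail / denom


def cochran_armitage_z(doses, dead, total) -> float:
    """Cochran-Armitage trend statistic (standard normal under no trend)."""
    n = sum(total)
    p_bar = sum(dead) / n
    t_stat = sum(d * (r - m * p_bar) for d, r, m in zip(doses, dead, total))
    s1 = sum(m * d for d, m in zip(doses, total))
    s2 = sum(m * d * d for d, m in zip(doses, total))
    variance = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n)
    if variance <= 0:
        raise ValueError("trend statistic undefined for these data")
    return t_stat / math.sqrt(variance)


def upper_tail_p(z: float) -> float:
    """One-sided p-value for an increasing trend (standard normal upper tail)."""
    return 0.5 * math.erfc(z / math.sqrt(2))
```

Both functions treat each implant as an independent observation, which is only defensible once the dispersion test has shown no departure from the binomial model; otherwise the nonparametric methods listed above apply.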
45.
A positive DL assay provides evidence for the genotoxicity of the test chemical in the germ cells of the treated male of the test species.
46.
Consideration of whether the observed values are within or outside of the historical control range can provide guidance when evaluating the biological significance of the response (40).
Test Report
47.
The test report should include the following information.
Summary.
Test chemical:
—
source, lot number, limit date for use, if available;
—
stability of the test chemical itself, if known;
—
solubility and stability of the test chemical in solvent, if known;
—
measurement of pH, osmolality, and precipitate in the culture medium to which the test chemical was added, as appropriate.
Mono-constituent substance:
—
physical appearance, water solubility, and additional relevant physicochemical properties;
—
chemical identification, such as IUPAC or CAS name, CAS number, SMILES or InChI code, structural formula, purity, chemical identity of impurities as appropriate and practically feasible, etc.
Multi-constituent substance, UVCBs and mixtures:
—
characterised as far as possible by chemical identity (see above), quantitative occurrence and relevant physicochemical properties of the constituents.
Test chemical preparation:
—
justification for choice of vehicle;
—
solubility and stability of the test chemical in the solvent/vehicle, if known;
—
preparation of dietary, drinking water or inhalation formulations;
—
analytical determinations on formulations (e.g. stability, homogeneity, nominal concentrations) when conducted.
Test animals:
—
species/strain used and justification for the choice;
—
number, age and sex of animals;
—
source, housing conditions, diet, etc.;
—
method of uniquely identifying the animals;
—
for short-term studies: individual body weight of the male animals at the start and end of the test; for studies longer than one week: individual body weights during the study and food consumption. Body weight range, mean and standard deviation for each group should be included.
Test conditions:
—
positive and negative (vehicle/solvent) control data;
—
data from the range-finding study;
—
rationale for dose level selection;
—
details of test chemical preparation;
—
details of the administration of the test chemical;
—
rationale for route of administration;
—
methods for measurement of animal toxicity, including, where available, histopathological or hematological analyses and the frequency with which animal observations and body weights were taken;
—
methods for verifying that the test chemical reached the target tissue, or general circulation, if negative results are obtained;
—
actual dose (mg/kg body weight/day) calculated from diet/drinking water test chemical concentration (ppm) and consumption, if applicable;
—
details of food and water quality;
—
details on cage environment enrichment;
—
detailed description of treatment and sampling schedules and justifications for the choices;
—
method of analgesia;
—
method of euthanasia;
—
procedures for isolating and preserving tissues;
—
source and lot numbers of all kits and reagents (where applicable);
—
methods for enumeration of DLs;
—
mating schedule;
—
methods used to determine that mating has occurred;
—
time of euthanasia;
—
criteria for scoring DL effects, including corpora lutea, implantations, resorptions and pre-implantation losses, live implants, dead implants.
Results:
—
animal condition prior to and throughout the test period, including signs of toxicity;
—
male body weight during the treatment and mating periods;
—
number of mated females;
—
dose-response relationship, where possible;
—
concurrent and historical negative control data with ranges, means and standard deviations;
—
concurrent positive control data;
—
tabulated data for each dam including: number of corpora lutea per dam; number of implantations per dam; number of resorptions and pre-implantation losses per dam; number of live implants per dam; number of dead implants per dam; fetus weights;
—
the above data summarised for each mating period and dose, with Dominant Lethal frequencies;
—
statistical analyses and methods applied.
Discussion of the results.
Conclusion.
LITERATURE
(1)
OECD (2016). Overview of the set of OECD Genetic Toxicology Test Guidelines and updates performed in 2014-2015. ENV Publications. Series on Testing and Assessment, No 234, OECD, Paris.
(2)
Bateman, A.J. (1977). The Dominant Lethal Assay in the Male Mouse, in Handbook of Mutagenicity Test Procedures, B.J. Kilbey et al. (Eds.), pp. 235-334, Elsevier, Amsterdam.
(3)
Ehling U.H., Machemer, L., Buselmaier, E., Dycka, D., Frohberg, H., Kratochvilova, J., Lang, R., Lorke, D., Muller, D., Pheh, J., Rohrborn, G., Roll, R., Schulze-Schencking, M., and Wiemann, H. (1978). Standard Protocol for the Dominant Lethal Test on Male Mice. Set up by the Work Group “Dominant Lethal Mutations of the ad hoc Committee Chemogenetics”, Arch. Toxicol., 39, 173-185.
(4)
Shelby M.D. (1996). Selecting Chemicals and Assays for Assessing Mammalian Germ Cell Mutagenicity. Mutation Res., 352:159-167.
(5)
Knudsen I., Hansen, E.V., Meyer, O.A. and Poulsen, E. (1977). A proposed Method for the Simultaneous Detection of Germ-Cell Mutations Leading to Fetal Death (Dominant Lethality) and of Malformations (Male Teratogenicity) in Mammals. Mutation Res., 48:267-270.
(6)
Anderson D., Hughes, J.A., Edwards, A.J. and Brinkworth, M.H. (1998). A Comparison of Male-Mediated Effects in Rats and Mice Exposed to 1,3-Butadiene. Mutation Res., 397:77-74.
(7)
Shively C.A., White, D.M., Blauch, J.L. and Tarka, S.M. Jr. (1984). Dominant Lethal Testing of Theobromine in Rats. Toxicol. Lett. 20:325-329.
(8)
Rao K.S., Cobel-Geard, S.R., Young, J.T., Hanley, T.R. Jr., Hayes, W.C., John, J.A. and Miller, R.R. (1983). Ethylene Glycol Monomethyl Ether II. Reproductive and Dominant Lethal Studies in Rats. Fundam. Appl. Toxicol., 3:80-85.
(9)
Brewen J.G., Payne, H.S., Jones, K.P., and Preston, R.J. (1975). Studies on Chemically Induced Dominant Lethality. I. The Cytogenetic Basis of MMS-Induced Dominant Lethality in Post-Meiotic Male Germ Cells, Mutation Res., 33, 239-249.
(10)
Marchetti F., Bishop, J.B., Cosentino, L., Moore II, D. and Wyrobek, A.J. (2004). Paternally Transmitted Chromosomal Aberrations in Mouse Zygotes Determine their Embryonic Fate. Biol. Reprod., 70:616-624.
(11)
Marchetti F. and Wyrobek, A.J. (2005). Mechanisms and Consequences of Paternally Transmitted Chromosomal Aberrations. Birth Defects Res., C 75:112-129.
(12)
Adler I.D. (1996). Comparison of the Duration of Spermatogenesis Between Rodents and Humans. Mutation Res., 352:169-172.
(13)
Favor J., and Crenshaw J.W. (1978). EMS-Induced Dominant Lethal Dose Response Curve in DBA/1J Male Mice, Mutation Res., 53: 21–27.
(14)
Generoso W.M., Witt, K.L., Cain, K.T., Hughes, L., Cacheiro, N.L.A., Lockhart, A.M.C. and Shelby, M.D. (1995). Dominant Lethal and Heritable Translocation Test with Chlorambucil and Melphalan. Mutation Res., 345:167-180.
(15)
Hastings S.E., Huffman K.W. and Gallo M.A. (1976). The Dominant Lethal Effect of Dietary Triethylenemelamine, Mutation Res., 40:371-378.
(16)
James D.A. and Smith D.M. (1982). Analysis of Results from a Collaborative Study of the Dominant Lethal Assay, Mutation Res., 99:303-314.
(17)
Shelby M.D., Cain, K.T., Hughes, L.A., Braden, P.W. and Generoso, W.M. (1986). Dominant Lethal Effects of Acrylamide in Male Mice. Mutation Res., 173:35-40.
Holstrom L.M., Palmer A.K. and Favor, J. (1993). The Rodent Dominant Lethal Assay. In Supplementary Mutagenicity Tests. Kirkland D.J. and Fox M. (Eds.), Cambridge University Press, pp. 129-156.
(20)
Adler I-D., Bootman, J., Favor, J., Hook, G., Schriever-Schwemmer, G., Welzl, G., Whorton, E., Yoshimura, I. and Hayashi, M. (1998). Recommendations for Statistical Designs of In Vivo Mutagenicity Tests with Regard to Subsequent Statistical Analysis, Mutation Res., 417:19–30.
(21)
Adler I.D., Shelby M. D., Bootman, J., Favor, J., Generoso, W., Pacchierotti, F., Shibuya, T. and Tanaka N. (1994). International Workshop on Standardisation of Genotoxicity Test Procedures. Summary Report of the Working Group on Mammalian Germ Cell Tests. Mutation Res., 312:313-318.
(22)
Generoso W.M. and Piegorsch W.W. (1993). Dominant Lethal Tests in Male and Female Mice. Methods Toxicol., 3A:124-141.
(23)
Haseman J.K. and Soares E.R. (1976). The Distribution of Fetal Death in Control Mice and its Implications on Statistical Tests for Dominant Lethal Effects. Mutation Res., 41:277-288.
(24)
Whorton E.B. Jr. (1981). Parametric Statistical Methods and Sample Size Considerations for Dominant Lethal Experiments. The Use of Clustering to Achieve Approximate Normality, Teratogen. Carcinogen. Mutagen., 1:353 – 360.
(25)
Anderson D., Hodge, M.C.E., Palmer, S., and Purchase, I.F.H. (1981). Comparison of Dominant Lethal and Heritable Translocation Methodologies. Mutation Res., 85:417-429.
(26)
Fielder R. J., Allen, J. A., Boobis, A. R., Botham, P. A., Doe, J., Esdaile, D. J., Gatehouse, D. G., Hodson-Walker, G., Morton, D. B., Kirkland, D. J. and Richold, M. (1992). Report of British Toxicology Society/UK Environmental Mutagen Society Working Group: Dose Setting in In Vivo Mutagenicity Assays. Mutagen., 7:313-319.
(27)
OECD (2000). Guidance Document on the Recognition, Assessment and Use of Clinical Signs as Humane Endpoints for Experimental Animals Used in Safety Evaluation. Environment, Health and Safety Publications, Series on Testing and Assessment (No.19.), Organisation for Economic Cooperation and Development, Paris.
(28)
Barrow M.V. and Taylor W.J. (1969). A Rapid Method for Detecting Malformations in Rat Fetuses. J. Morphol., 127:291–306.
(29)
Kirkland D.J., (Ed.)(1989). Statistical Evaluation of Mutagenicity Test Data, Cambridge University Press.
(30)
Hayashi, M., Dearfield, K., Kasper P., Lovell D., Martus H.-J. and Thybaud V. (2011). Compilation and Use of Genetic Toxicity Historical Control Data. Mutation Res., 723:87-90.
(31)
Lockhart A.C., Piegorsch W.W. and Bishop J.B. (1992). Assessing Over Dispersion and Dose-Response in the Male Dominant Lethal Assay. Mutation Res., 272:35-58.
(32)
Cochran W.G. (1954). Some Methods for Strengthening the Common χ2 Tests. Biometrics, 10: 417-451.
(33)
Tarone R.E. (1979). Testing the Goodness of Fit of the Binomial Distribution. Biometrika, 66: 585-590.
(34)
Margolin B.H. (1988). Test for Trend in Proportions. In Encyclopedia of Statistical Sciences, Volume 9, Kotz S. and Johnson N. L. (Eds.), pp. 334-336. John Wiley and Sons, New York.
(35)
Cox D.R. (1970). Analysis of Binary Data. Chapman and Hall, London.
(36)
Neter J., Kutner M.H., Nachtsheim C.J. and Wasserman W. (1996). Applied Linear Statistical Models, Fourth Edition, Chapters 14 and 17. McGraw-Hill, Boston.
(37)
Jonckheere R. (1954). A Distribution-Free K-Sample Test Against Ordered Alternatives. Biometrika, 41:133-145.
(38)
Conover W.J. (1971). Practical Nonparametric Statistics. John Wiley and Sons, New York.
(39)
Efron, B. (1982). The Jackknife, the Bootstrap and Other Resampling Plans. Society for Industrial and Applied Mathematics, Philadelphia, PA.
(40)
Fleiss J. (1973). Statistical Methods for Rates and Proportions. John Wiley and Sons, New York.
Appendix 1
DEFINITIONS
Chemical: A substance or a mixture
Corpus luteum (corpora lutea): the hormone-secreting structure formed on the ovary at the site of a follicle that has released its egg. The number of corpora lutea in the ovaries corresponds to the number of eggs that were ovulated.
Dominant Lethal Mutation: a mutation that occurs in a germ cell, or is fixed after fertilisation, and that causes embryonic or foetal death.
Fertility rate: the number of mated females that are pregnant divided by the total number of mated females.
Mating interval: the time between the end of exposure and mating of treated males. By controlling this interval, chemical effects on different germ cell types can be assessed. In the mouse, mating during the 1st, 2nd, 3rd, 4th, 5th, 6th, 7th and 8th week after the end of exposure measures effects in sperm, condensed spermatids, round spermatids, pachytene spermatocytes, early spermatocytes, differentiated spermatogonia, differentiating spermatogonia and stem cell spermatogonia, respectively.
Preimplantation loss: the difference between the number of implants and the number of corpora lutea. It can also be estimated by comparing the total implants per female in treated and control groups.
Postimplantation loss: the ratio of dead to total implants in the treated group compared to the ratio of dead to total implants in the control group.
Test chemical: Any substance or mixture tested using this test method.
UVCB: Chemical Substance of Unknown or Variable Composition, Complex Reaction Products and Biological Materials
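The quantitative definitions above can be illustrated with a short sketch. It is not part of the test method: the function and variable names are hypothetical, the week-to-stage table simply restates the mating interval definition for the mouse, and preimplantation loss is taken as corpora lutea minus implants so that the result is non-negative.

```python
# Illustrative only: names are hypothetical, formulas restate the definitions above.

# Germ cell stage sampled in the mouse, by week of mating after the end of exposure.
MOUSE_STAGE_BY_MATING_WEEK = {
    1: "sperm",
    2: "condensed spermatids",
    3: "round spermatids",
    4: "pachytene spermatocytes",
    5: "early spermatocytes",
    6: "differentiated spermatogonia",
    7: "differentiating spermatogonia",
    8: "stem cell spermatogonia",
}


def fertility_rate(pregnant_females, mated_females):
    """Number of pregnant mated females over the number of mated females."""
    return pregnant_females / mated_females


def preimplantation_loss(corpora_lutea, implants):
    """Difference between the number of corpora lutea and the number of implants."""
    return corpora_lutea - implants


def dead_implant_ratio(dead_implants, total_implants):
    """Ratio of dead to total implants for one group (treated or control)."""
    return dead_implants / total_implants
```

Postimplantation loss is then assessed by comparing `dead_implant_ratio` for the treated group with the same ratio for the control group.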
Appendix 2
TIMING OF SPERMATOGENESIS IN MAMMALS
Fig.1: Comparison of the duration (days) of male germ cell development in mice, rats and humans. DNA repair does not occur during the periods indicated by shading.
A schematic of spermatogenesis in the mouse, rat and human is shown above (taken from Adler, 1996). Undifferentiated spermatogonia include A-single, A-paired and A-aligned spermatogonia (Hess and de Franca, 2008). A-single spermatogonia are considered the true stem cells; therefore, to assess effects on stem cells, at least 49 days (in the mouse) must pass between the last injection of the test chemical and mating.
References
Adler, ID (1996). Comparison of the duration of spermatogenesis between rodents and humans. Mutat Res, 352:169-172.
Hess, RA, De Franca LR (2008). Spermatogenesis and cycle of the seminiferous epithelium. In: Molecular Mechanisms in Spermatogenesis, C. Yan Cheng (Ed), Landes Bioscience and Springer Science+Business Media:1-15.
"
(4)
In Part B, Chapter B.23 is replaced by the following:
"B.23 MAMMALIAN SPERMATOGONIAL CHROMOSOMAL ABERRATION TEST
INTRODUCTION
1.
This test method (TM) is equivalent to the OECD test guideline 483 (2016). Test methods are periodically reviewed in the light of scientific progress, changing regulatory needs, and animal welfare considerations. This modified version of the test method reflects many years of experience with this assay and the potential for integrating or combining this test with other toxicity or genotoxicity studies. Combining toxicity studies has the potential to reduce the numbers of animals used in toxicity testing. This test method is part of a series of test methods on genetic toxicology. A document that provides succinct information on genetic toxicology testing and an overview of the recent changes that were made to genetic toxicity OECD test guidelines has been developed by OECD (1).
2.
The purpose of the in vivo mammalian spermatogonial chromosomal aberration test is to identify those chemicals that cause structural chromosomal aberrations in mammalian spermatogonial cells (2) (3) (4). In addition, this test is relevant to assessing genotoxicity because, although they may vary among species, factors of in vivo metabolism, pharmacokinetics and DNA-repair processes are active and contribute to the response. This test method is not designed to measure numerical abnormalities; the assay is not routinely used for this purpose.
3.
This test measures structural chromosomal aberrations (both chromosome- and chromatid-type) in dividing spermatogonial germ cells and is, therefore, expected to be predictive of induction of heritable mutations in these germ cells.
4.
Definitions of key terms are set out in the Appendix.
INITIAL CONSIDERATIONS
5.
Rodents are routinely used in this test but other species may in some cases be appropriate if scientifically justified. Standard cytogenetic preparations of rodent testes generate mitotic (spermatogonia) and meiotic (spermatocyte) metaphases. Mitotic and meiotic metaphases are identified based on the morphology of the chromosomes (4). This in vivo cytogenetic test detects structural chromosomal aberrations in spermatogonial mitoses. Other target cells are not the subject of this test method.
6.
To detect chromatid-type aberrations in spermatogonial cells, the first mitotic cell division following treatment should be examined before these aberrations are converted into chromosome-type-aberrations in subsequent cell divisions. Additional information from treated spermatocytes can be obtained by meiotic chromosome analysis for chromosomal structural aberrations at diakinesis-metaphase I and metaphase II.
7.
A number of generations of spermatogonia are present in the testis (5), and these different germ cell types may have a spectrum of sensitivity to chemical treatment. Thus, the aberrations detected represent an aggregate response of treated spermatogonial cell populations. The majority of mitotic cells in testis preparations are B spermatogonia, which have a cell cycle of approximately 26 hr (3).
8.
If there is evidence that the test chemical, or its metabolite(s), will not reach the testis, it is not appropriate to use this test.
PRINCIPLE OF THE TEST METHOD
9.
Generally, animals are exposed to the test chemical by an appropriate route of exposure and are euthanised at appropriate times after treatment. Prior to euthanasia, animals are treated with a metaphase-arresting agent (e.g. colchicine or Colcemid®). Chromosome preparations are then made from germ cells and stained, and metaphase cells are analysed for chromosome aberrations.
VERIFICATION OF LABORATORY PROFICIENCY
10.
Competency in this assay should be established by demonstrating the ability to reproduce expected results for structural chromosomal aberration frequencies in spermatogonia with positive control substances (including weak responses) such as those listed in Table 1, and by obtaining negative control frequencies that are consistent with the acceptable range of control data in the published literature (e.g. (2)(3)(6)(7)(8)(9)(10)) or with the laboratory’s historical control distribution, if available.
DESCRIPTION OF THE METHOD
Preparations
Selection of animal species
11.
Commonly used laboratory strains of healthy young adult animals should be employed. Male mice are commonly used; however, males of other appropriate mammalian species may be used when scientifically justified and to allow this test to be run in conjunction with another test method. The scientific justification for using species other than rodents should be provided in the report.
Animal housing and feeding conditions
12.
For rodents, the temperature in the animal room should be 22 °C (± 3 °C). Although the relative humidity ideally should be 50-60 %, it should be at least 40 % and preferably not exceed 70 % other than during room cleaning. Lighting should be artificial, the sequence being 12 hours light, 12 hours dark. For feeding, conventional laboratory diets may be used with an unlimited supply of drinking water. The choice of diet may be influenced by the need to ensure a suitable admixture of a test chemical when administered by this route. Rodents should be housed in small groups (no more than five per cage) if no aggressive behaviour is expected, preferably in solid floor cages with appropriate environmental enrichment. Animals may be housed individually if scientifically justified.
Preparation of the animals
13.
Healthy young adult male animals (8-12 weeks old at start of treatment) are normally used, and are randomly assigned to the control and treatment groups. The individual animals are identified uniquely using a humane, minimally invasive method (e.g. by ringing, tagging, micro-chipping or biometric identification, but not ear or toe clipping) and acclimated to the laboratory conditions for at least five days. Cages should be arranged in such a way that possible effects due to cage placement are minimised. Cross contamination by the positive control and test chemical should be avoided. At the commencement of the study, the variation between individual animal weights should be minimal and not exceed ± 20 %.
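As an illustration only, the weight criterion above could be checked as sketched below. The sketch assumes that the ± 20 % tolerance is measured against the mean weight of the animals, which the paragraph does not state explicitly; the function name and the sample weights are hypothetical.

```python
from statistics import mean


def weights_within_tolerance(weights_g, tolerance=0.20):
    """Return True if every weight lies within +/- tolerance of the mean weight.

    Assumption: the reference value is the mean of all animals supplied.
    """
    reference = mean(weights_g)
    return all(abs(w - reference) <= tolerance * reference for w in weights_g)


# Hypothetical body weights in grams at the commencement of the study.
print(weights_within_tolerance([25, 27, 30, 28, 26]))  # True
print(weights_within_tolerance([20, 30, 31, 29, 30]))  # False
```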
Preparation of doses
14.
Solid test chemicals should be dissolved or suspended in appropriate solvents or vehicles or admixed in diet or drinking water prior to dosing of the animals. Liquid test chemicals may be dosed directly or diluted prior to dosing. For inhalation exposures, test chemicals can be administered as gas, vapour, or a solid/liquid aerosol, depending on their physicochemical properties. Fresh preparations of the test chemical should be employed unless stability data demonstrate the acceptability of storage and define the appropriate storage conditions.
Test conditions - Solvent/vehicle
15.
The solvent/vehicle should not produce toxic effects at the dose levels used, and should not be capable of chemical reaction with the test chemicals. If other than well-known solvents/vehicles are used, their inclusion should be supported with reference data indicating their compatibility. It is recommended that, wherever possible, the use of an aqueous solvent/vehicle should be considered first. Examples of commonly used compatible solvents/vehicles include water, physiological saline, methylcellulose solution, carboxymethyl cellulose sodium salt solution, olive oil and corn oil. In the absence of historical or published control data showing that no structural chromosomal aberrations and other deleterious effects are induced by a chosen atypical solvent/vehicle, an initial study should be conducted in order to establish the acceptability of the solvent/vehicle control.
Positive controls
16.
Concurrent positive control animals should always be used unless the laboratory has demonstrated proficiency in the conduct of the test and has used the test routinely in the recent past (e.g. within the last 5 years). When a concurrent positive control group is not included, scoring controls (fixed and unstained slides) should be included in each experiment. These can be obtained by including within the scoring of the study appropriate reference samples that have been obtained and stored from a separate positive control experiment conducted periodically (e.g. every 6-18 months) in the laboratory where the test is performed; for example, during proficiency testing and on a regular basis thereafter, where necessary.
17.
Positive control substances should reliably produce a detectable increase in the frequencies of cells with structural chromosomal aberrations over the spontaneous levels. Positive control doses should be chosen so that the effects are clear but do not immediately reveal the identity of the coded samples to the scorer. Examples of positive control substances are included in Table 1.
Negative controls
18.
Negative control animals, treated with solvent or vehicle alone, and otherwise treated in the same way as the treatment groups, should be included for every sampling time. In the absence of historical or published control data showing that no chromosomal aberrations or other deleterious effects are induced by the chosen solvent/vehicle, untreated control animals also should be included for every sampling time in order to establish acceptability of the vehicle control.
PROCEDURE
Number of animals
19.
Group sizes at study initiation should be established with the aim of providing a minimum of 5 male animals per group. This number of animals per group is considered to be sufficient to provide adequate statistical power (i.e. generally able to detect at least a doubling in chromosomal aberration frequency when the negative control level is 1,0 % or greater with 80 % probability at a significance level of 0,05) (3) (11). As a guide to typical maximum animal requirements, a study at two sampling times with three dose groups and a concurrent negative control group, plus a positive control group (each composed of five animals per group), would require 45 animals.
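The figure of 45 animals can be reproduced with the arithmetic sketched below. The breakdown is an assumption, since the paragraph gives only the total: every dose group and the concurrent negative control are counted at both sampling times, and the positive control group is counted once. The function name and parameters are hypothetical.

```python
def max_animals(dose_groups=3, sampling_times=2, animals_per_group=5,
                positive_control_groups=1):
    """Upper estimate of animals needed for one study (assumed breakdown)."""
    # Dose groups plus one concurrent negative control at each sampling time.
    treated_and_negative = (dose_groups + 1) * sampling_times * animals_per_group
    # Positive control group(s), counted once rather than per sampling time.
    positive = positive_control_groups * animals_per_group
    return treated_and_negative + positive


print(max_animals())  # 45
```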
Treatment schedule
20.
Test chemicals are usually administered once (i.e. as a single treatment); other dose regimens may be used, provided they are scientifically justified.
21.
In the highest dose group two sampling times after treatment are used. Since the time required for uptake and metabolism of the test chemical(s), as well as its effect on cell cycle kinetics, can affect the optimum time for chromosomal aberration detection, one early and one late sampling time approximately 24 and 48 hours after treatment are used. For doses other than the highest dose, an early sampling time of 24 hours (less than or equal to the cell cycle time of B spermatogonia and thus optimising the probability of scoring first post-treatment metaphases) after treatment should be taken, unless another sampling time is known to be more appropriate and justified.
22.
Other sampling times may be used. For example, in the case of chemicals that exert S-independent effects, earlier sampling times (i.e. less than 24 hr) may be appropriate.
23.
A repeat dose treatment regimen can be used, such as in conjunction with a test on another endpoint that uses a 28 day administration period (e.g., TM B.58); however, additional animal groups would be required to accommodate different sampling times. Accordingly, the appropriateness of such a schedule needs to be justified scientifically on a case-by-case basis.
24.
Prior to euthanasia, animals are injected intraperitoneally with an appropriate dose of a metaphase arresting chemical (e.g. Colcemid® or colchicine). Animals are sampled at an appropriate interval thereafter. For mice and rats, this interval is approximately 3 - 5 hours.
Dose levels
25.
If a preliminary range-finding study is performed because there are no suitable data already available to aid in dose selection, it should be performed in the same laboratory, using the same species, strain, and treatment regimen to be used in the main study, according to recommendations for conducting dose range-finding studies (12). This study should aim to identify the maximum tolerated dose (MTD), defined as the dose inducing slight toxic effects relative to the duration of the study period (for example, abnormal behaviour or reactions, minor body weight depression or hematopoietic system cytotoxicity) but not death or evidence of pain, suffering or distress necessitating euthanasia of the animals (13).
26.
The highest dose may also be defined as a dose that produces some indication of toxicity in the spermatogonial cells (e.g. a reduction in the ratio of spermatogonial mitoses to first and second meiotic metaphases). This reduction should not exceed 50 %.
27.
Test chemicals with specific biological activities at low non-toxic doses (such as hormones and mitogens), and chemicals which exhibit saturation of toxicokinetic properties may be exceptions to the dose-setting criteria and should be evaluated on a case-by-case basis.
28.
In order to obtain dose-response information, a complete study should include a negative control group (paragraph 18) and a minimum of three dose levels generally separated by a factor of 2, but by no greater than 4. If the test chemical does not produce toxicity in a range-finding study or based on existing data, the highest dose for a single administration should be 2 000 mg/kg body weight. However, if the test chemical does cause toxicity, the MTD should be the highest dose administered, and the dose levels used should preferably cover a range from the maximum to a dose producing little or no toxicity. When target tissue (i.e. testis) toxicity is observed at all dose levels tested, further study at non-toxic doses is advisable. Studies intending to more fully characterise the quantitative dose-response information may require additional dose groups. For certain types of test chemicals (e.g. human pharmaceuticals) covered by specific requirements, these limits may vary. If the test chemical does not produce toxicity, the limit dose plus two lower doses (as described above) should be selected. The limit dose for an administration period of 14 days or more is 1 000 mg/kg body weight/day, and for administration periods of less than 14 days, the limit dose is 2 000 mg/kg body weight/day.
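A minimal sketch of the dose-setting rules in this paragraph follows. It is illustrative only: the function names are hypothetical, `mtd=None` is used to mean that no toxicity was observed, and the sketch spaces the lower doses downwards from the top dose by one constant factor, which is only one way of meeting the spacing recommendation.

```python
def limit_dose(administration_days):
    """Limit dose in mg/kg body weight/day for the given administration period."""
    return 1000.0 if administration_days >= 14 else 2000.0


def select_doses(administration_days, mtd=None, spacing=2.0, n_levels=3):
    """Return dose levels from highest to lowest.

    mtd: maximum tolerated dose in mg/kg body weight, or None when the test
    chemical produced no toxicity (the limit dose is then used as the top dose).
    """
    if n_levels < 3:
        raise ValueError("a minimum of three dose levels is expected")
    if not 1.0 < spacing <= 4.0:
        raise ValueError("dose levels should be separated by a factor of no more than 4")
    top = float(mtd) if mtd is not None else limit_dose(administration_days)
    return [top / spacing ** i for i in range(n_levels)]


print(select_doses(1))            # [2000.0, 1000.0, 500.0]
print(select_doses(28, mtd=300))  # [300.0, 150.0, 75.0]
```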
Administration of doses
29.
The anticipated route of human exposure should be considered when designing an assay. Therefore, routes of exposure such as dietary, drinking water, topical, subcutaneous, intravenous, oral (by gavage), inhalation, or implantation may be chosen as justified. In any case, the route should be chosen to ensure adequate exposure of the target tissue. Intraperitoneal injection is not normally recommended unless scientifically justified since it is not usually a physiologically relevant route of human exposure. If the test chemical is admixed in diet or drinking water, especially in case of single dosing, care should be taken that the delay between food and water consumption and sampling is sufficient to allow detection of the effects (see paragraph 33). The maximum volume of liquid that can be administered by gavage or injection at one time depends on the size of the test animal. The volume should not normally exceed 1 ml/100 g body weight except in the case of aqueous solutions where a maximum of 2 ml/100 g body weight may be used. The use of volumes greater than this (if permitted by animal welfare legislation) should be justified. Variability in test volume should be minimised by adjusting the concentration to ensure a constant volume in relation to body weight at all dose levels.
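The volume limits and the constant-volume recommendation can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, body weight is given in grams, and the concentration is derived for a fixed dosing volume per 100 g body weight so that only the concentration changes between dose levels.

```python
def max_dose_volume_ml(body_weight_g, aqueous=False):
    """Maximum volume (ml) normally administered at one time by gavage or injection."""
    ml_per_100g = 2.0 if aqueous else 1.0
    return ml_per_100g * body_weight_g / 100.0


def concentration_mg_per_ml(dose_mg_per_kg, volume_ml_per_100g=1.0):
    """Concentration giving the dose at a constant volume per body weight."""
    volume_ml_per_kg = volume_ml_per_100g * 10.0
    return dose_mg_per_kg / volume_ml_per_kg


# Hypothetical 30 g mouse dosed with a non-aqueous vehicle.
print(max_dose_volume_ml(30))         # 0.3
print(concentration_mg_per_ml(2000))  # 200.0
```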
Observations
30.
General clinical observations of the test animals should be made and clinical signs recorded at least once a day, preferably at the same time(s) each day and considering the peak period of anticipated effects after dosing. At least twice daily, all animals should be observed for morbidity and mortality. All animals should be weighed at study initiation, at least once a week during repeated-dose studies, and at euthanasia. In studies of at least one-week duration, measurements of food consumption should be made at least weekly. If the test chemical is administered via the drinking water, water consumption should be measured at each change of water and at least weekly. Animals exhibiting non-lethal indicators of excess toxicity should be euthanised prior to completion of the test period (13).
Chromosome preparation
31.
Immediately after euthanasia, germ cell suspensions are obtained from one, or both, testes, exposed to hypotonic solution and fixed following established protocols (e.g. (2) (14) (15)). The cells are then spread on slides and stained (16) (17). All slides should be coded so that their identity is not available to the scorer.
Analysis
32.
At least 200 well spread metaphases should be scored for each animal (3) (11). If the historical negative control frequency is < 1 %, more than 200 cells/animal should be scored to increase the statistical power (3). Staining methods that permit the identification of the centromere should be used.
33.
Chromosome and chromatid-type aberrations should be recorded separately and classified by sub-types (breaks, exchanges). Gaps should be recorded, but not considered, when determining whether a chemical induces significant increases in the incidence of cells with chromosomal aberrations. Procedures in use in the laboratory should ensure that analysis of chromosomal aberrations is performed by well-trained scorers. Recognising that slide preparation procedures often result in the breakage of a proportion of metaphases with a resulting loss of chromosomes, the cells scored should, therefore, contain a number of centromeres not less than 2n±2, where n is the haploid number of chromosomes for that species.
34.
Although the purpose of the test is to detect structural chromosomal aberrations, it is important to record the frequencies of polyploid cells and cells with endoreduplicated chromosomes when these events are seen (see Paragraph 45).
DATA AND REPORTING
Treatment of results
35.
Individual animal data should be presented in tabular form. For each animal the number of cells with structural chromosomal aberration(s) and the number of chromosome aberrations per cell should be evaluated. Chromatid- and chromosome-type aberrations classified by sub-types (breaks, exchanges) should be listed separately with their numbers and frequencies for experimental and control groups. Gaps are recorded separately. The frequency of gaps is reported but generally not included in the analysis of the total structural chromosomal aberration frequency. Percentage of polyploidy and cells with endoreduplicated chromosomes are reported when seen.
36.
Data on toxicity and clinical signs (as per Paragraph 30) should be reported.
Acceptability Criteria
37.
The following criteria determine the acceptability of a test.
—
Concurrent negative control is consistent with published norms for historical negative control data, which are generally expected to be > 0 % and ≤ 1,5 % cells with chromosomal aberrations, and the laboratory's historical control data if available (see Paragraphs 10 and 18).
—
Concurrent positive controls induce responses that are consistent with published norms for historical positive control data, or the laboratory’s historical positive control database, if available, and produce a statistically significant increase compared with the negative control (see Paragraphs 17, 18).
—
Adequate numbers of cells and doses have been analysed (see Paragraphs 28 and 32).
—
The criteria for the selection of top dose are consistent with those described in Paragraphs 25 and 26.
38.
If both mitosis and meiosis are observed, the ratio of spermatogonial mitoses to first and second meiotic metaphases should be determined as a measure of cytotoxicity for all treated and negative control animals in a total sample of 100 dividing cells per animal. If only mitosis is observed, the mitotic index should be determined in at least 1 000 cells for each animal.
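The cytotoxicity measure described above, together with the 50 % limit on its reduction given in Paragraph 26, can be sketched as follows. The function names and cell counts are hypothetical, and the reduction is assumed to be expressed relative to the concurrent negative control ratio.

```python
def mitosis_to_meiosis_ratio(spermatogonial_mitoses, meiotic_metaphases):
    """Ratio of spermatogonial mitoses to first and second meiotic metaphases."""
    return spermatogonial_mitoses / meiotic_metaphases


def ratio_reduction_percent(treated_ratio, control_ratio):
    """Reduction of the treated ratio relative to the negative control, in percent."""
    return 100.0 * (1.0 - treated_ratio / control_ratio)


def reduction_acceptable(reduction_percent, limit_percent=50.0):
    """The reduction should not exceed the limit (50 % in Paragraph 26)."""
    return reduction_percent <= limit_percent


# Hypothetical counts in 100 dividing cells per animal.
control = mitosis_to_meiosis_ratio(50, 50)  # 1.0
treated = mitosis_to_meiosis_ratio(40, 60)
reduction = ratio_reduction_percent(treated, control)
print(round(reduction, 1), reduction_acceptable(reduction))
```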
Evaluation and interpretation of results
39.
At least three treated dose groups should be analysed in order to provide sufficient data for dose-response analysis.
40.
Providing that all acceptability criteria are fulfilled, a test chemical is considered a clear positive if:
—
at least one of the test doses exhibits a statistically significant increase compared with the concurrent negative control;
—
the increase is dose-related at least at one sampling time; and,
—
any of the results are outside the acceptable range of negative control data, or outside the distribution of the laboratory’s historical negative control data (e.g. Poisson-based 95 % control limit), if available.
The test chemical is then considered able to induce chromosomal aberrations in spermatogonial cells of the test animals. Recommendations for the most appropriate statistical methods can also be found in the literature (11) (18). Statistical tests used should consider the animal as the experimental unit.
41.
Providing that all acceptability criteria are fulfilled, a test chemical is considered a clear negative if:
—
none of the test doses exhibits a statistically significant increase compared with the concurrent negative control;
—
there is no dose-related increase in any experimental condition; and,
—
all results are within the acceptable range of negative control data, or the laboratory’s historical negative control data (e.g. Poisson-based 95 % control limit), if available.
The test chemical is then considered unable to induce chromosomal aberrations in the spermatogonial cells of the test animals. Recommendations for the most appropriate statistical methods can also be found in the literature (11) (18). A negative result does not exclude the possibility that the chemical may induce chromosomal aberrations at later developmental phases not studied, or gene mutations.
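The decision logic of Paragraphs 40 and 41 can be summarised in a short sketch. It assumes that all acceptability criteria are already fulfilled and that the three inputs have been established beforehand by appropriate statistical analysis; the function name and return strings are hypothetical.

```python
def classify_result(significant_increase, dose_related, outside_control_range):
    """Classify a study outcome from the three criteria of Paragraphs 40 and 41.

    significant_increase: at least one dose is statistically significantly
        increased compared with the concurrent negative control.
    dose_related: the increase is dose-related at least at one sampling time.
    outside_control_range: any result lies outside the acceptable range of
        negative control data.
    """
    if significant_increase and dose_related and outside_control_range:
        return "clear positive"
    if not (significant_increase or dose_related or outside_control_range):
        return "clear negative"
    # Neither all criteria met nor all criteria absent.
    return "not clear: expert judgment and/or further investigation"


print(classify_result(True, True, True))     # clear positive
print(classify_result(False, False, False))  # clear negative
print(classify_result(True, False, False))
```

Any mixed outcome falls through to the last branch, which corresponds to the case-by-case evaluation described in the following paragraphs.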
42.
There is no requirement for verification of a clear positive or clear negative response.
43.
If the response is not clearly negative or positive, and in order to assist in establishing the biological relevance of a result (e.g. a weak or borderline increase), the data should be evaluated by expert judgment and/or further investigations using the existing experimental data, such as consideration of whether the positive result is outside the acceptable range of negative control data, or the laboratory's historical negative control data (19).
44.
In rare cases, even after further investigations, the data set will preclude making a conclusion of positive or negative results, and will therefore be concluded as equivocal.
45.
An increase in the number of polyploid cells may indicate that the test chemical has the potential to inhibit mitotic processes and to induce numerical chromosomal aberrations (20). An increase in the number of cells with endoreduplicated chromosomes may indicate that the test chemical has the potential to inhibit cell cycle progress (21) (22), which is a different mechanism of inducing numerical chromosome changes than inhibition of mitotic processes (see Paragraph 2). Therefore incidence of polyploid cells and cells with endoreduplicated chromosomes should be recorded separately.
Test report
46.
The test report should include the following information:
Summary.
Test chemical:
—
source, lot number, limit date for use, if available;
—
stability of the test chemical itself, if known;
—
solubility and stability of the test chemical in solvent, if known;
—
measurement of pH, osmolality, and precipitate in the culture medium to which the test chemical was added, as appropriate.
Mono-constituent substance:
—
physical appearance, water solubility, and additional relevant physicochemical properties;
—
chemical identification, such as IUPAC or CAS name, CAS number, SMILES or InChI code, structural formula, purity, chemical identity of impurities as appropriate and practically feasible, etc.
Multi-constituent substances, UVCBs and mixtures:
—
characterised as far as possible by chemical identity (see above), quantitative occurrence and relevant physicochemical properties of the constituents.
Test chemical preparation:
—
justification for choice of vehicle;
—
solubility and stability of the test chemical in solvent/vehicle;
—
preparation of dietary, drinking water or inhalation formulations;
—
analytical determinations on formulations (e.g. stability, homogeneity, nominal concentrations), when conducted.
Test animals:
—
species/strain used and justification for use;
—
number and age of animals;
—
source, housing conditions, diet, etc.;
—
method for uniquely identifying the animals;
—
for short-term studies: individual weight of the animals at the start and end of the test; for studies longer than one week: individual body weights during the study and food consumption. Body weight range, mean and standard deviation for each group should be included.
Test conditions:
—
positive and negative (vehicle/solvent) control data;
—
data from range finding study, if conducted;
—
rationale for dose level selection;
—
rationale for route of administration;
—
details of test chemical preparation;
—
details of the administration of the test chemical;
—
rationale for sacrifice times;
—
methods for measurement of animal toxicity, including, where available, histopathological or hematological analyses and the frequency with which animal observations and body weights were taken;
—
methods for verifying that the test chemical reached the target tissue, or general circulation, if negative results are obtained;
—
actual dose (mg/kg body weight/day) calculated from diet/drinking water test chemical concentration (ppm) and consumption, if applicable;
—
details of food and water quality;
—
detailed description of treatment and sampling schedules and justifications for the choices;
—
method of euthanasia;
—
method of analgesia (where used);
—
procedures for isolating tissues;
—
identity of metaphase arresting chemical, its concentration and duration of treatment;
—
methods of slide preparation;
—
criteria for scoring aberrations;
—
number of cells analysed per animal;
—
criteria for considering studies as positive, negative or equivocal.
Results:
—
animal condition prior to and throughout the test period, including signs of toxicity;
—
body and organ weights at sacrifice (if multiple treatments are employed, body weights taken during the treatment regimen);
—
signs of toxicity;
—
mitotic index;
—
ratio of spermatogonial mitoses to first and second meiotic metaphases, or other evidence of exposure to the target tissue;
—
type and number of aberrations, given separately for each animal;
—
total number of aberrations per group with means and standard deviations;
—
number of cells with aberrations per group with means and standard deviations;
—
dose-response relationship, where possible;
—
statistical analyses and methods applied;
—
concurrent negative control data;
—
historical negative control data with ranges, means, standard deviations, and 95 % confidence interval (where available), or published historical negative control data used for acceptability of the test results;
—
concurrent positive control data;
—
changes in ploidy, if seen, including frequencies of polyploidy and/or endoreduplicated cells.
Discussion of the results
Conclusion
LITERATURE
(1)
OECD (2016). Overview of the set of OECD Genetic Toxicology Test Guidelines and updates performed in 2014-2015. ENV Publications. Series on Testing and Assessment, No 234, OECD, Paris.
(2)
Adler, I.-D. (1984). Cytogenetic Tests in Mammals. In: Mutagenicity Testing: a Practical Approach. Ed. S. Venitt and J. M. Parry. IRL Press, Oxford, Washington DC, pp. 275-306.
(3)
Adler I.-D., Shelby M. D., Bootman, J., Favor, J., Generoso, W., Pacchierotti, F., Shibuya, T. and Tanaka N. (1994). International Workshop on Standardisation of Genotoxicity Test Procedures. Summary Report of the Working Group on Mammalian Germ Cell Tests. Mutation Res., 312, 313-318.
(4)
Russo, A. (2000). In Vivo Cytogenetics: Mammalian Germ Cells. Mutation Res., 455, 167-189.
(5)
Hess, R.A. and de Franca L.R. (2008). Spermatogenesis and Cycle of the Seminiferous Epithelium. In: Molecular Mechanisms in Spermatogenesis, Cheng C.Y. (Ed.) Landes Bioscience and Springer Science+Business Media, pp. 1-15.
(6)
Adler, I.-D. (1974). Comparative Cytogenetic Study after Treatment of Mouse Spermatogonia with Mitomycin C, Mutation Res., 23(3): 368-379. Adler, I.D. (1986). Clastogenic Potential in Mouse Spermatogonia of Chemical Mutagens Related to their Cell-Cycle Specifications. In: Genetic Toxicology of Environmental Chemicals, Part B: Genetic Effects and Applied Mutagenesis, Ramel C., Lambert B. and Magnusson J. (Eds.) Liss, New York, pp. 477-484.
(7)
Cattanach, B.M., and Pollard C.E. (1971). Mutagenicity Tests with Cyclohexylamine in the Mouse, Mutation Res., 12, 472-474.
(8)
Cattanach, B.M., and Williams, C.E. (1971). A search for Chromosome Aberrations Induced in Mouse Spermatogonia by Chemical Mutagens, Mutation Res., 13, 371-375.
(9)
Rathenburg, R. (1975). Cytogenetic Effects of Cyclophosphamide on Mouse Spermatogonia, Humangenetik 29, 135-140.
(10)
Shiraishi, Y. (1978). Chromosome Aberrations Induced by Monomeric Acrylamide in Bone Marrow and Germ Cells of Mice, Mutation Res., 57(3): 313–324.
(11)
Adler I-D., Bootman, J., Favor, J., Hook, G., Schriever-Schwemmer, G., Welzl, G., Whorton, E., Yoshimura, I. and Hayashi, M. (1998). Recommendations for Statistical Designs of In Vivo Mutagenicity Tests with Regard to Subsequent Statistical Analysis, Mutation Res., 417, 19–30.
(12)
Fielder, R. J., Allen, J. A., Boobis, A. R., Botham, P. A., Doe, J., Esdaile, D. J., Gatehouse, D. G., Hodson-Walker, G., Morton, D. B., Kirkland, D. J. and Richold, M. (1992). Report of British Toxicology Society/UK Environmental Mutagen Society Working Group: Dose setting in In Vivo Mutagenicity Assays. Mutagenesis, 7, 313-319.
(13)
OECD (2000). Guidance Document on the Recognition, Assessment and Use of Clinical Signs as Humane Endpoints for Experimental Animals Used in Safety Evaluation, Series on Testing and Assessment, (No 19.), Organisation for Economic Cooperation and Development, Paris.
(14)
Yamamoto, K. and Kikuchi, Y. (1978). A New Method for Preparation of Mammalian Spermatogonial Chromosomes. Mutation Res., 52, 207-209.
(15)
Hsu, T.C., Elder, F. and Pathak, S. (1979). Method for Improving the Yield of Spermatogonial and Meiotic Metaphases in Mammalian Testicular Preparations. Environ. Mutagen., 1, 291-294.
(16)
Evans, E.P., Breckon, G., and Ford, C.E. (1964). An Air-Drying Method for Meiotic Preparations from Mammalian Testes. Cytogenetics and Cell Genetics, 3, 289-294.
(17)
Richold, M., Ashby, J., Bootman, J., Chandley, A., Gatehouse, D.G. and Henderson, L. (1990). In Vivo Cytogenetics Assays, In: D.J. Kirkland (Ed.) Basic Mutagenicity Tests, UKEMS Recommended Procedures. UKEMS Subcommittee on Guidelines for Mutagenicity Testing. Report. Part I revised. Cambridge University Press, Cambridge, New York, Port Chester, Melbourne, Sydney, pp. 115-141.
(18)
Lovell, D.P., Anderson, D., Albanese, R., Amphlett, G.E., Clare, G., Ferguson, R., Richold, M., Papworth, D.G. and Savage, J.R.K. (1989). Statistical Analysis of In Vivo Cytogenetic Assays, In: D.J. Kirkland (Ed.) Statistical Evaluation of Mutagenicity Test Data. UKEMS Sub-Committee on Guidelines for Mutagenicity Testing, Report, Part III. Cambridge University Press, Cambridge, New York, Port Chester, Melbourne, Sydney, pp. 184-232.
(19)
Hayashi, M., Dearfield, K., Kasper, P., Lovell, D., Martus, H.-J. and Thybaud, V. (2011). Compilation and Use of Genetic Toxicity Historical Control Data. Mutation Res., 723, 87-90.
(20)
Warr T.J., Parry E.M. and Parry J.M. (1993). A Comparison of Two In Vitro Mammalian Cell Cytogenetic Assays for the Detection of Mitotic Aneuploidy Using 10 Known or Suspected Aneugens, Mutation Res., 287, 29-46.
(21)
Huang, Y., Change, C. and Trosko, J.E. (1983). Aphidicolin-Induced Endoreduplication in Chinese Hamster Cells. Cancer Res., 43, 1362-1364.
(22)
Locke-Huhle, C. (1983). Endoreduplication in Chinese Hamster Cells during Alpha-Radiation Induced G2 Arrest. Mutation Res., 119, 403-413.
Appendix
DEFINITIONS
Aneuploidy: any deviation from the normal diploid (or haploid) number of chromosomes by a single chromosome or more than one, but not by entire set(s) of chromosomes (polyploidy).
Centromere: Region(s) of a chromosome with which spindle fibers are associated during cell division, allowing orderly movement of daughter chromosomes to the poles of the daughter cells.
Chemical: A substance or a mixture
Chromosome diversity: diversity of chromosome shapes (e.g. metacentric, acrocentric, etc.) and sizes.
Chromatid-type aberration: structural chromosome damage expressed as breakage of single chromatids or breakage and reunion between chromatids.
Chromosome-type aberration: structural chromosome damage expressed as breakage, or breakage and reunion, of both chromatids at an identical site.
Clastogen: any chemical which causes structural chromosomal aberrations in populations of cells or organisms.
Gap: an achromatic lesion smaller than the width of one chromatid, and with minimum misalignment of the chromatids.
Genotoxic: a general term encompassing all types of DNA or chromosome damage, including breaks, deletions, adducts, nucleotide modifications and linkages, rearrangements, mutations, chromosome aberrations, and aneuploidy. Not all types of genotoxic effects result in mutations or stable chromosome damage.
Mitotic index (MI): the ratio of cells in metaphase divided by the total number of cells observed in a population of cells; an indication of the degree of proliferation of that population.
Mitosis: division of the cell nucleus usually divided into prophase, prometaphase, metaphase, anaphase, and telophase.
Mutagenic: produces a heritable change of DNA base-pair sequence(s) in genes or of the structure of chromosomes (chromosome aberrations).
Numerical abnormality: a change in the number of chromosomes from the normal number characteristic of the animals utilised.
Polyploidy: a multiple of the haploid chromosome number (n) other than the diploid number (i.e., 3n, 4n and so on).
Structural aberration: a change in chromosome structure detectable by microscopic examination of the metaphase stage of cell division, observed as deletions and fragments, exchanges.
Test chemical: Any substance or mixture tested using this test method.
UVCB: Chemical Substances of Unknown or Variable Composition, Complex Reaction Products and Biological Materials
"
(5)
In Part B, Chapter B.40 is replaced by the following:
"B.40 IN VITRO SKIN CORROSION: TRANSCUTANEOUS ELECTRICAL RESISTANCE TEST METHOD (TER)
INTRODUCTION
1.
This test method (TM) is equivalent to OECD test guideline (TG) 430 (2015). Skin corrosion refers to the production of irreversible damage to the skin manifested as visible necrosis through the epidermis and into the dermis, following the application of a test chemical [as defined by the United Nations (UN) Globally Harmonized System of Classification and Labelling of Chemicals (GHS) (1) and the European Union (EU) Regulation 1272/2008 on Classification, Labelling and Packaging of Substances and Mixtures (CLP) (1)]. This updated test method B.40 provides an in vitro procedure allowing the identification of non-corrosive and corrosive substances and mixtures in accordance with UN GHS (1) and CLP.
2.
The assessment of skin corrosivity has typically involved the use of laboratory animals (TM B.4, equivalent to OECD TG 404 originally adopted in 1981, and revised in 1992, 2002 and 2015) (2). In addition to the present TM B.40, other in vitro test methods for testing of skin corrosion potential of chemicals have been validated and adopted as TM B.40bis (equivalent to OECD TG 431) (3) and TM B.65 (equivalent to OECD TG 435) (4), that are also able to identify sub-categories of corrosive chemicals when required. Several validated in vitro test methods have been adopted as TM B.46 (equivalent to OECD TG 439) (5), to be used for the testing of skin irritation. An OECD guidance document on Integrated Approaches to Testing and Assessment (IATA) for Skin Corrosion and Irritation describes several modules which group various information sources and analysis tools and provides guidance on (i) how to integrate and use existing testing and non-testing data for the assessment of skin irritation and skin corrosion potentials of chemicals and (ii) an approach for cases where further testing is needed (6).
3.
This test method addresses the human health endpoint skin corrosion. It is based on the rat skin transcutaneous electrical resistance (TER) test method, which utilises skin discs to identify corrosives by their ability to produce a loss of normal stratum corneum integrity and barrier function. The corresponding OECD test guideline was originally adopted in 2004 and updated in 2015 to refer to the IATA guidance document.
4.
In order to evaluate in vitro skin corrosion testing for regulatory purposes, pre-validation studies (7) followed by a formal validation study of the rat skin TER test method for assessing skin corrosion were conducted (8) (9) (10) (11). The outcome of these studies led to the recommendation that the TER test method (designated the Validated Reference Method – VRM) could be used for regulatory purposes for the assessment of in vivo skin corrosivity (12) (13) (14).
5.
Before a proposed similar or modified in vitro TER test method for skin corrosion other than the VRM can be used for regulatory purposes, its reliability, relevance (accuracy), and limitations for its proposed use should be determined to ensure its similarity to the VRM, in accordance with the requirements of the Performance Standards (PS) (15). OECD Mutual Acceptance of Data will only be guaranteed after any proposed new or updated test method following the PS has been reviewed and included in the corresponding OECD test guideline.
DEFINITIONS
6.
Definitions used are provided in the Appendix.
INITIAL CONSIDERATIONS
7.
A validation study (10) and other published studies (16) (17) have reported that the rat skin TER test method is able to discriminate between known skin corrosives and non-corrosives with an overall sensitivity of 94 % (51/54) and specificity of 71 % (48/68) for a database of 122 substances.
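The percentages quoted in paragraph 7 follow directly from the counts given in parentheses. The following is a minimal illustrative sketch in Python; the function name `percentage` is hypothetical and is not part of the test method.

```python
def percentage(correct: int, total: int) -> float:
    """Return correct/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# Counts quoted in paragraph 7.
sensitivity = percentage(51, 54)  # corrosives correctly identified
specificity = percentage(48, 68)  # non-corrosives correctly identified
database_size = 54 + 68           # corrosives plus non-corrosives

print(f"sensitivity = {sensitivity:.0f} %")
print(f"specificity = {specificity:.0f} %")
print(f"database size = {database_size} substances")
```

The two denominators add up to the 122 substances of the database, which gives a quick consistency check on the reported figures.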
8.
This test method addresses in vitro skin corrosion. It allows the identification of non-corrosive and corrosive test chemicals in accordance with the UN GHS/CLP. A limitation of this test method, as demonstrated by the validation studies (8) (9) (10) (11), is that it does not allow the sub-categorisation of corrosive substances and mixtures in accordance with the UN GHS/CLP. The applicable regulatory framework will determine how this test method will be used. While this test method does not provide adequate information on skin irritation, it should be noted that TM B.46 specifically addresses the health effect skin irritation in vitro (5). For a full evaluation of local skin effects after a single dermal exposure, the OECD Guidance Document on IATA should be consulted (6).
9.
A wide range of chemicals representing mainly substances has been tested in the validation underlying this test method and the empirical database of the validation study amounted to 60 substances covering a wide range of chemical classes (8) (9). On the basis of the overall data available, the test method is applicable to a wide range of chemical classes and physical states including liquids, semi-solids, solids and waxes. However, since for specific physical states test items with suitable reference data are not readily available, it should be noted that a comparably small number of waxes and corrosive solids were assessed during validation. The liquids may be aqueous or non-aqueous; solids may be soluble or insoluble in water. In cases where evidence can be demonstrated on the non-applicability of the test method to a specific category of substances, the test method should not be used for that specific category of substances. In addition, this test method is assumed to be applicable to mixtures as an extension of its applicability to substances. However, due to the fact that mixtures cover a wide spectrum of categories and composition, and that only limited information is currently available on the testing of mixtures, in cases where evidence can be demonstrated on the non-applicability of the test method to a specific category of mixtures (e.g. following a strategy as proposed by Eskes et al., 2012) (18), the test method should not be used for that specific category of mixtures. Before use of the test method on a mixture for generating data for an intended regulatory purpose, it should be considered whether, and if so why, it may provide adequate results for that purpose. Such considerations are not needed, when there is a regulatory requirement for testing of the mixture. Gases and aerosols have not been assessed yet in validation studies (8) (9). While it is conceivable that these can be tested using the TER test method, the current test method does not allow testing of gases and aerosols.
PRINCIPLE OF THE TEST
10.
The test chemical is applied for up to 24 hours to the epidermal surfaces of skin discs in a two-compartment test system in which the skin discs function as the separation between the compartments. The skin discs are taken from humanely killed rats aged 28-30 days. Corrosive chemicals are identified by their ability to produce a loss of normal stratum corneum integrity and barrier function, which is measured as a reduction in the TER below a threshold level (16) (see paragraph 32). For rat skin TER, a cut-off value of 5 kΩ has been selected based on extensive data for a wide range of substances where the vast majority of values were either clearly well above (often > 10 kΩ), or well below (often < 3 kΩ) this value (16). Generally, test chemicals that are non-corrosive in animals but are irritant or non-irritant do not reduce the TER below this cut-off value. Furthermore, use of other skin preparations or other equipment may alter the cut-off value, necessitating further validation.
11.
A dye-binding step is incorporated into the test procedure for confirmation testing of positive results in the TER including values around 5 kΩ. The dye-binding step determines if the increase in ionic permeability is due to physical destruction of the stratum corneum. The TER method utilising rat skin has been shown to be predictive of in vivo corrosivity in the rabbit assessed under TM B.4 (2).
DEMONSTRATION OF PROFICIENCY
12.
Prior to routine use of the rat skin TER test method that adheres to this test method, laboratories should demonstrate technical proficiency by correctly classifying the twelve Proficiency Substances recommended in Table 1. In situations where a listed substance is unavailable or where justifiable, another substance for which adequate in vivo and in vitro reference data are available may be used (e.g. from the list of reference chemicals (16)) provided that the same selection criteria as described in Table 1 are applied.
Abbreviations: aq = aqueous; CASRN = Chemical Abstracts Service Registry Number; VRM = Validated Reference Method; C = corrosive; NC = not corrosive.
PROCEDURE
13.
Standard Operating Procedures (SOP) for the rat skin TER skin corrosion test method are available (19). The rat skin TER test methods covered by this test method should comply with the following conditions:
Animals
14.
Rats should be used because the sensitivity of their skin to substances in this test method has been previously demonstrated (12) and rat skin is the only skin source that has been formally validated (8) (9). The age (when the skin is collected) and strain of the rat are particularly important to ensure that the hair follicles are in the dormant phase before adult hair growth begins.
15.
The dorsal and flank hair from young, approximately 22 day-old, male or female rats (Wistar-derived or a comparable strain), is carefully removed with small clippers. Then, the animals are washed by careful wiping, whilst submerging the clipped area in antibiotic solution (containing, for example, streptomycin, penicillin, chloramphenicol, and amphotericin, at concentrations effective in inhibiting bacterial growth). Animals are washed with antibiotics again on the third or fourth day after the first wash and are used within 3 days of the second wash, when the stratum corneum has recovered from the hair removal.
Preparation of the skin discs
16.
Animals are humanely killed when 28-30 days old; this age is critical. The dorso-lateral skin of each animal is then removed and stripped of excess subcutaneous fat by carefully peeling it away from the skin. Skin discs, with a diameter of approximately 20 mm each, are removed. The skin may be stored before discs are used where it is shown that positive and negative control data are equivalent to those obtained with fresh skin.
17.
Each skin disc is placed over one of the ends of a PTFE (polytetrafluoroethylene) tube, ensuring that the epidermal surface is in contact with the tube. A rubber ‘O’ ring is press-fitted over the end of the tube to hold the skin in place and excess tissue is trimmed away. The rubber ‘O’ ring is then carefully sealed to the end of the PTFE tube with petroleum jelly. The tube is supported by a spring clip inside a receptor chamber containing MgSO4 solution (154 mM) (Figure 1). The skin disc should be fully submerged in the MgSO4 solution. As many as 10-15 skin discs can be obtained from a single rat skin. Tube and ‘O’ ring dimensions are shown in Figure 2.
18.
Before testing begins, the TER of two skin discs is measured as a quality control procedure for each animal skin. Both discs should give electrical resistance values greater than 10 kΩ for the remainder of the discs to be used for the test method. If the resistance value is less than 10 kΩ, the remaining discs from that skin should be discarded.
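The quality control check of paragraph 18 can be expressed as a simple acceptance rule. The following Python sketch is illustrative only; the names `QC_THRESHOLD_KOHM` and `skin_passes_qc` are hypothetical, and the input is assumed to be the two measured resistance values in kΩ.

```python
QC_THRESHOLD_KOHM = 10.0  # minimum resistance required by paragraph 18


def skin_passes_qc(qc_disc_ters_kohm):
    """Return True if both quality control discs read above 10 kΩ.

    If this returns False, the remaining discs from the same skin
    should be discarded rather than used for testing.
    """
    if len(qc_disc_ters_kohm) != 2:
        raise ValueError("exactly two quality control discs are expected")
    return all(ter > QC_THRESHOLD_KOHM for ter in qc_disc_ters_kohm)


print(skin_passes_qc([14.2, 11.8]))  # both discs above the threshold
print(skin_passes_qc([14.2, 8.9]))   # one disc below the threshold
```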
Application of the test chemical and control substances
19.
Concurrent positive and negative controls should be used for each run (experiment) to ensure adequate performance of the experimental model. Skin discs from a single animal should be used in each run (experiment). The suggested positive and negative control test chemicals are 10 M hydrochloric acid and distilled water, respectively.
20.
Liquid test chemicals (150 μl) are applied uniformly to the epidermal surface inside the tube. When testing solid materials, a sufficient amount of the solid is applied evenly to the disc to ensure that the whole surface of the epidermis is covered. Deionised water (150 μl) is added on top of the solid and the tube is gently agitated. In order to achieve maximum contact with the skin, solids may need to be warmed to 30° C to melt or soften the test chemical, or ground to produce a granular material or powder.
21.
Three skin discs are used for each test and control chemical in each testing run (experiment). Test chemicals are applied for 24 hours at 20-23° C. The test chemical is removed by washing with a jet of tap water at up to room temperature until no further material can be removed.
TER measurements
22.
The skin impedance is measured as TER by using a low-voltage, alternating current Wheatstone bridge (18). General specifications of the bridge are 1-3 V operating voltage, a sinusoidal or rectangular alternating current of 50-1 000 Hz, and a measuring range of at least 0,1-30 kΩ. The databridge used in the validation study measured inductance, capacitance and resistance up to values of 2 000 H, 2 000 μF, and 2 MΩ, respectively, at frequencies of 100 Hz or 1 kHz, using series or parallel values. For the purposes of the TER corrosivity assay, measurements are recorded in resistance, at a frequency of 100 Hz and using series values. Prior to measuring the electrical resistance, the surface tension of the skin is reduced by adding a sufficient volume of 70 % ethanol to cover the epidermis. After a few seconds, the ethanol is removed from the tube and the tissue is then hydrated by the addition of 3 ml MgSO4 solution (154 mM). The databridge electrodes are placed on either side of the skin disc to measure the resistance in kΩ/skin disc (Figure 1). Electrode dimensions and the length of the electrode exposed below the crocodile clips are shown in Figure 2. The clip attached to the inner electrode is rested on the top of the PTFE tube during resistance measurement to ensure that a consistent length of electrode is submerged in the MgSO4 solution. The outer electrode is positioned inside the receptor chamber so that it rests on the bottom of the chamber. The distance between the spring clip and the bottom of the PTFE tube is maintained as a constant (Figure 2), because this distance affects the resistance value obtained. Consequently, the distance between the inner electrode and the skin disc should be constant and minimal (1-2 mm).
23.
If the measured resistance value is greater than 20 kΩ, this may be due to the remains of the test chemical coating the epidermal surface of the skin disc. Further removal of this coating can be attempted, for example, by sealing the PTFE tube with a gloved thumb and shaking it for approximately 10 seconds; the MgSO4 solution is discarded and the resistance measurement is repeated with fresh MgSO4.
24.
The properties and dimensions of the test apparatus and the experimental procedure used may influence the TER values obtained. The 5 kΩ corrosive threshold was developed from data obtained with the specific apparatus and procedure described in this test method. Different threshold and control values may apply if the test conditions are altered or a different apparatus is used. Therefore, it is necessary to calibrate the methodology and resistance threshold values by testing a series of Proficiency Substances chosen from the substances used in the validation study (8) (9), or from similar chemical classes to the substances being investigated. A set of suitable Proficiency Substances is identified in Table 1.
Dye Binding Methods
25.
Exposure of certain non-corrosive materials can result in a reduction of resistance below the cut-off of 5 kΩ allowing the passage of ions through the stratum corneum, thereby reducing the electrical resistance (9). For example, neutral organics and substances that have surface-active properties (including detergents, emulsifiers and other surfactants) can remove skin lipids making the barrier more permeable to ions. Thus, if TER values produced by such chemicals are less than or around 5 kΩ in the absence of visually perceptible damage of the skin discs, an assessment of dye penetration should be carried out on the control and treated tissues to determine if the TER values obtained were the result of increased skin permeability, or skin corrosion (7) (9). In case of the latter where the stratum corneum is disrupted, the dye sulforhodamine B, when applied to the skin surface rapidly penetrates and stains the underlying tissue. This particular dye is stable to a wide range of substances and is not affected by the extraction procedure described below.
Sulforhodamine B dye application and removal
26.
Following TER assessment, the magnesium sulphate is discarded from the tube and the skin is carefully examined for obvious damage. If there is no obvious major damage (e.g. perforation), 150 μl of a 10 % (w/v) dilution in distilled water of the dye sulforhodamine B (Acid Red 52; C.I. 45100; CAS number 3520-42-1), is applied to the epidermal surface of each skin disc for 2 hours. These skin discs are then washed with tap water at up to room temperature for approximately 10 seconds to remove any excess/unbound dye. Each skin disc is carefully removed from the PTFE tube and placed in a vial (e.g. a 20-ml glass scintillation vial) containing deionised water (8 ml). The vials are agitated gently for 5 minutes to remove any additional unbound dye. This rinsing procedure is then repeated, after which the skin discs are removed and placed into vials containing 5 ml of 30 % (w/v) sodium dodecyl sulphate (SDS) in distilled water and are incubated overnight at 60° C.
27.
After incubation, each skin disc is removed and discarded and the remaining solution is centrifuged for 8 minutes at 21° C (relative centrifugal force ~175 × g). A 1 ml sample of the supernatant is diluted 1 in 5 (v/v) [i.e. 1 ml + 4 ml] with 30 % (w/v) SDS in distilled water. The optical density (OD) of the solution is measured at 565 nm.
Calculation of dye content
28.
The sulforhodamine B dye content per disc is calculated from the OD values (9) (sulforhodamine B dye molar extinction coefficient at 565 nm = 8,7 × 10⁴; molecular weight = 580). The dye content is determined for each skin disc by the use of an appropriate calibration curve and mean dye content is then calculated for the replicates.
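As an illustration of how the constants in paragraph 28 relate OD to dye content, the following Python sketch applies the Beer-Lambert law. It is an estimate under stated assumptions and does not replace the calibration curve required by the test method: the extinction coefficient is assumed to be in l/(mol·cm), the optical path length is assumed to be 1 cm, and the 5 ml extraction volume and 1 in 5 dilution are taken from paragraphs 26 and 27. All names are hypothetical.

```python
MOLAR_EXTINCTION = 8.7e4    # at 565 nm (paragraph 28); l/(mol·cm) assumed
MOLECULAR_WEIGHT = 580.0    # g/mol (paragraph 28)
DILUTION_FACTOR = 5.0       # 1 ml supernatant + 4 ml SDS (paragraph 27)
EXTRACTION_VOLUME_L = 5e-3  # 5 ml SDS extraction volume (paragraph 26)
PATH_LENGTH_CM = 1.0        # assumption: standard 1 cm cuvette


def dye_content_ug(od: float) -> float:
    """Estimate the dye content of one skin disc (μg) from its OD."""
    # Concentration of the diluted sample in mol/l (Beer-Lambert law).
    conc_diluted = od / (MOLAR_EXTINCTION * PATH_LENGTH_CM)
    # Moles of dye in the undiluted 5 ml extract.
    moles = conc_diluted * DILUTION_FACTOR * EXTRACTION_VOLUME_L
    # Convert moles to grams, then grams to micrograms.
    return moles * MOLECULAR_WEIGHT * 1e6


def mean_dye_content_ug(od_values) -> float:
    """Mean dye content (μg/disc) over the replicate skin discs."""
    contents = [dye_content_ug(od) for od in od_values]
    return sum(contents) / len(contents)


print(f"{dye_content_ug(0.30):.1f} μg/disc")
print(f"{mean_dye_content_ug([0.28, 0.30, 0.32]):.1f} μg/disc")
```

Under these assumptions an OD of 0,30 corresponds to 50 μg/disc. A laboratory's own calibration curve takes precedence over this estimate.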
Acceptability Criteria
29.
The mean TER results are accepted if the concurrent positive and negative control values fall within the acceptable ranges for the method in the testing laboratory. The acceptable resistance ranges for the methodology and apparatus described above are given in the following table:
Control | Substance | Resistance range (kΩ)
Positive | 10 M Hydrochloric acid | 0,5 – 1,0
Negative | Distilled water | 10 – 25
30.
The mean dye binding results are accepted on condition that concurrent control values fall within the acceptable ranges for the method. Suggested acceptable dye content ranges for the control substances for the methodology and apparatus described above are given in the following table:
Control | Substance | Dye content range (μg/disc)
Positive | 10 M Hydrochloric acid | 40 – 100
Negative | Distilled water | 15 – 35
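The acceptance checks of paragraphs 29 and 30 can be sketched as a range test on the concurrent control means. The Python example below is illustrative only; the names are hypothetical, the range limits are those tabulated above, and treating the limits as inclusive is an assumption. A testing laboratory may apply its own established ranges.

```python
# Acceptable control ranges from paragraphs 29 and 30 (limits assumed inclusive).
TER_RANGES_KOHM = {"positive": (0.5, 1.0), "negative": (10.0, 25.0)}
DYE_RANGES_UG = {"positive": (40.0, 100.0), "negative": (15.0, 35.0)}


def within(value: float, bounds) -> bool:
    """Return True if value lies inside the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


def controls_acceptable(positive_mean: float, negative_mean: float, ranges) -> bool:
    """Return True if both concurrent control means fall within their ranges."""
    return (within(positive_mean, ranges["positive"])
            and within(negative_mean, ranges["negative"]))


# Mean TER values in kΩ for 10 M HCl and distilled water.
print(controls_acceptable(0.8, 15.0, TER_RANGES_KOHM))
# Mean dye content in μg/disc for the same controls.
print(controls_acceptable(60.0, 40.0, DYE_RANGES_UG))
```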
Interpretation of results
31.
The cut-off TER value distinguishing corrosive from non-corrosive test chemicals was established during test method optimisation, tested during a pre-validation phase, and confirmed in a formal validation study.
32.
The prediction model for rat skin TER skin corrosion test method (9) (19), associated with the UN GHS/CLP classification system, is given below:
The test chemical is considered to be non-corrosive to skin:
i)
if the mean TER value obtained for the test chemical is greater than (>) 5 kΩ, or
ii)
the mean TER value obtained for the test chemical is less than or equal to (≤) 5 kΩ, and
—
the skin discs show no obvious damage (e.g. perforation), and
—
the mean disc dye content is less than (<) the mean disc dye content of the 10 M HCl positive control obtained concurrently (see paragraph 30 for positive control values).
The test chemical is considered to be corrosive to skin:
i)
if the mean TER value obtained for the test chemical is less than or equal to (≤) 5 kΩ and the skin discs are obviously damaged (e.g. perforated), or
ii)
the mean TER value obtained for the test chemical is less than or equal to (≤) 5 kΩ, and
—
the skin discs show no obvious damage (e.g. perforation), but
—
the mean disc dye content is greater than or equal to (≥) the mean disc dye content of the 10 M HCl positive control obtained concurrently (see paragraph 30 for positive control values).
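The prediction model of paragraph 32 can be written as a short decision function. The Python sketch below is illustrative only; the function and parameter names are hypothetical, and the inputs are assumed to be mean values over the replicate skin discs of one testing run.

```python
CUT_OFF_KOHM = 5.0  # TER cut-off value from paragraph 32


def classify(mean_ter_kohm, obvious_damage=False,
             mean_dye_ug=None, positive_control_dye_ug=None):
    """Apply the paragraph 32 prediction model to one test chemical.

    Returns "corrosive" or "non-corrosive".
    """
    # Mean TER above the cut-off: non-corrosive, no dye data needed.
    if mean_ter_kohm > CUT_OFF_KOHM:
        return "non-corrosive"
    # Mean TER at or below the cut-off with obvious damage: corrosive.
    if obvious_damage:
        return "corrosive"
    # Otherwise the dye binding result decides.
    if mean_dye_ug is None or positive_control_dye_ug is None:
        raise ValueError("dye binding data are needed when the mean TER is "
                         "at or below 5 kΩ and no obvious damage is seen")
    if mean_dye_ug >= positive_control_dye_ug:
        return "corrosive"
    return "non-corrosive"


print(classify(12.3))
print(classify(3.1, obvious_damage=True))
print(classify(4.2, mean_dye_ug=25.0, positive_control_dye_ug=70.0))
print(classify(4.2, mean_dye_ug=85.0, positive_control_dye_ug=70.0))
```

The dye content comparison uses the concurrently obtained 10 M HCl positive control, as the prediction model specifies, rather than a fixed value.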
33.
A testing run (experiment) composed of at least three replicate skin discs should be sufficient for a test chemical when the classification is unequivocal. However, in cases of borderline results, such as non-concordant replicate measurements and/or mean TER equal to 5 ± 0,5 kΩ, a second independent testing run (experiment) should be considered, as well as a third one in case of discordant results between the first two testing runs (experiments).
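The borderline conditions of paragraph 33 can be screened for automatically. The Python sketch below is illustrative only and rests on an assumption: "non-concordant replicate measurements" is interpreted here as replicate discs falling on different sides of the 5 kΩ cut-off, which the test method does not define explicitly. The function name is hypothetical.

```python
def second_run_advisable(replicate_ters_kohm, cut_off=5.0, margin=0.5):
    """Flag a testing run whose TER results look borderline.

    Borderline means either a mean TER within cut_off ± margin, or
    replicates on different sides of the cut-off (an assumed reading
    of "non-concordant").
    """
    if len(replicate_ters_kohm) < 3:
        raise ValueError("at least three replicate skin discs are expected")
    mean_ter = sum(replicate_ters_kohm) / len(replicate_ters_kohm)
    near_cut_off = abs(mean_ter - cut_off) <= margin
    above = [ter > cut_off for ter in replicate_ters_kohm]
    non_concordant = any(above) and not all(above)
    return near_cut_off or non_concordant


print(second_run_advisable([12.0, 11.0, 13.0]))  # clearly above the cut-off
print(second_run_advisable([5.2, 5.4, 5.1]))     # mean close to 5 kΩ
print(second_run_advisable([2.0, 8.0, 9.0]))     # replicates disagree
```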
DATA AND REPORTING
Data
34.
Resistance values (kΩ) and dye content values (μg/disc), where appropriate, for the test chemical, as well as for positive and negative controls should be reported in tabular form, including data for each individual replicate disc in each testing run (experiment) and mean values ± SD. All repeat experiments should be reported. Observed damage in the skin discs should be reported for each test chemical.
Test report
35.
The test report should include the following information:
Test Chemical and Control Substances:
—
Mono-constituent substance: chemical identification, such as IUPAC or CAS name, CAS number, SMILES or InChI code, structural formula, purity, chemical identity of impurities as appropriate and practically feasible, etc;
—
Multi-constituent substance, UVCB and mixture: characterised as far as possible by chemical identity (see above), quantitative occurrence and relevant physico-chemical properties of the constituents;
—
Physical appearance, water solubility, and additional relevant physico-chemical properties;
—
Source, lot number if available;
—
Treatment of the test chemical/control substance prior to testing, if applicable (e.g. warming, grinding);
—
Stability of the test chemical, limit date for use, or date for re-analysis if known;
—
Storage conditions.
Test Animals:
—
Strain and sex used;
—
Age of the animals when used as donor animals;
—
Source, housing condition, diet, etc.;
—
Details of the skin preparation.
Test Conditions:
—
Calibration curves for test apparatus;
—
Calibration curves for dye binding test performance, band pass used for measuring OD values, and OD linearity range of measuring device (e.g. spectrophotometer), if appropriate;
—
Details of the test procedure used for TER measurements;
—
Details of the test procedure used for the dye binding assessment, if appropriate;
—
Test doses used, duration of exposure period(s) and temperature(s) of exposure;
—
Details on washing procedure used after the exposure period;
—
Number of replicate skin discs used per test chemical and controls (positive and negative control);
—
Description of any modification of the test procedure;
—
Reference to historical data of the model. This should include, but is not limited to:
i)
Acceptability of the positive and negative control TER values (in kΩ) with reference to positive and negative control resistance ranges
ii)
Acceptability of the positive and negative control dye content values (in μg/disc) with reference to positive and negative control dye content ranges
iii)
Acceptability of the test results with reference to historical variability between skin disc replicates
—
Description of decision criteria/prediction model applied.
Results:
—
Tabulation of data from the TER and dye binding assays (if appropriate) for individual test chemicals and controls, for each testing run (experiment) and each skin disc replicate (individual animals and individual skin samples), means, SDs and CVs;
—
Description of any effects observed;
—
The derived classification with reference to the prediction model/decision criteria used.
Discussion of the results
Conclusions
LITERATURE
(1)
United Nations (UN) (2013). Globally Harmonized System of Classification and Labelling of Chemicals (GHS), Fifth Revised Edition, UN New York and Geneva, 2013. Available at: [http://www.unece.org/trans/danger/publi/ghs/ghs_rev05/05files_e.html].
(2)
Chapter B.4 of this Annex, Acute Dermal Irritation, Corrosion.
(3)
Chapter B.40bis of this Annex, In Vitro Skin Model.
(4)
Chapter B.65 of this Annex, In Vitro Membrane Barrier Test Method.
(5)
Chapter B.46 of this Annex, In Vitro Skin Irritation: Reconstructed Human Epidermis Test Method.
(6)
OECD (2014). Guidance document on Integrated Approaches to Testing and Assessment for Skin Irritation/Corrosion. Environment, Health and Safety Publications, Series on Testing and Assessment, (No 203), Organisation for Economic Cooperation and Development, Paris.
(7)
Botham P.A., Chamberlain M., Barratt M.D., Curren R.D., Esdaile D.J., Gardner J.R., Gordon V.C., Hildebrand B., Lewis R.W., Liebsch M., Logemann P., Osborne R., Ponec M., Regnier J.F., Steiling W., Walker A.P., and Balls M. (1995). A Prevalidation Study on In Vitro Skin Corrosivity Testing. The Report and Recommendations of ECVAM Workshop 6. ATLA 23, 219-255.
(8)
Barratt M.D., Brantom P.G., Fentem J.H., Gerner I., Walker A.P., and Worth A.P. (1998). The ECVAM International Validation Study on In Vitro Tests for Skin Corrosivity. 1. Selection and Distribution of the Test Chemicals. Toxic. In Vitro 12, 471-482.
(9)
Fentem J.H., Archer G.E.B., Balls M., Botham P.A., Curren R.D., Earl L.K., Esdaile D.J., Holzhütter H.-G., and Liebsch M. (1998). The ECVAM International Validation Study on In Vitro Tests for Skin Corrosivity. 2. Results and Evaluation by the Management Team. Toxic. In Vitro 12, 483-524.
(10)
Balls M., Blaauboer B.J., Fentem J.H., Bruner L., Combes R.D., Ekwall B., Fielder R.J., Guillouzo A., Lewis R.W., Lovell D.P., Reinhardt C.A., Repetto G., Sladowski D., Spielmann H., and Zucco F. (1995). Practical Aspects of the Validation of Toxicity Test Procedures. The Report and Recommendations of ECVAM Workshops. ATLA 23, 129-147.
(11)
ICCVAM (Interagency Coordinating Committee on the Validation of Alternative Methods). (1997). Validation and Regulatory Acceptance of Toxicological Test Methods. NIH Publication No 97-3981. National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA.
(12)
EC-ECVAM (1998). Statement on the Scientific Validity of the Rat Skin Transcutaneous Electrical Resistance (TER) Test (an In Vitro Test for Skin Corrosivity), Issued by the ECVAM Scientific Advisory Committee (ESAC10), 3 April 1998.
ICCVAM (Interagency Coordinating Committee on the Validation of Alternative Methods) (2002). ICCVAM Evaluation of EpiDerm™ (EPI-200), EPISKIN™ (SM), and the Rat Skin Transcutaneous Electrical Resistance (TER) Assay: In Vitro Test Methods for Assessing Dermal Corrosivity Potential of Chemicals. NIH Publication No 02-4502. National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA.
(15)
OECD (2015). Performance Standards for the Assessment of Proposed Similar or Modified In Vitro Transcutaneous Electrical Resistance (TER) Test Method for Skin Corrosion in Relation to TG 430. Environmental Health and Safety Publications, Series on Testing and Assessment No 218. Organisation for Economic Cooperation and Development, Paris.
(16)
Oliver G.J.A., Pemberton M.A., and Rhodes C. (1986). An In Vitro Skin Corrosivity Test - Modifications and Validation. Fd. Chem. Toxicol. 24, 507-512.
(17)
Botham P.A., Hall T.J., Dennett R., McCall J.C., Basketter D.A., Whittle E., Cheeseman M., Esdaile D.J., and Gardner J. (1992). The Skin Corrosivity Test In Vitro: Results of an Interlaboratory Trial. Toxicol. In Vitro 6, 191-194.
(18)
Eskes C., Detappe V., Koëter H., Kreysa J., Liebsch M., Zuang V., Amcoff P., Barroso J., Cotovio J., Guest R., Hermann M., Hoffmann S., Masson P., Alépée N., Arce L.A., Brüschweiler B., Catone T., Cihak R., Clouzeau J., D'Abrosca F., Delveaux C., Derouette J.P., Engelking O., Facchini D., Fröhlicher M., Hofmann M., Hopf N., Molinari J., Oberli A., Ott M., Peter R., Sá-Rocha V.M., Schenk D., Tomicic C., Vanparys P., Verdon B., Wallenhorst T., Winkler G.C. and Depallens O. (2012). Regulatory Assessment of In Vitro Skin Corrosion and Irritation Data Within the European Framework: Workshop Recommendations. Regul. Toxicol. Pharmacol. 62, 393-403.
(19)
TER SOP (December 2008). INVITTOX Protocol (No 115) Rat Skin Transcutaneous Electrical Resistance (TER) Test.
(20)
OECD (2005). Guidance Document on the Validation and International Acceptance of New or Updated Test Methods for Hazard Assessment. Environment, Health and Safety Publications, Series on Testing and Assessment (No 34), Organisation for Economic Cooperation and Development, Paris.
Figure 1
Apparatus for the Rat Skin TER Assay
Figure 2
Dimensions of the Polytetrafluoroethylene (PTFE) and Receptor Tubes and Electrodes Used
Critical factors of the apparatus shown above:
—
The inner diameter of the PTFE tube,
—
The length of the electrodes relative to the PTFE tube and receptor tube, such that the skin disc should not be touched by the electrodes and that a standard length of electrode is in contact with the MgSO4 solution,
—
The amount of MgSO4 solution in the receptor tube should give a depth of liquid, relative to the level in the PTFE tube, as shown in Figure 1,
—
The skin disc should be fixed well enough to the PTFE tube, such that the electrical resistance is a true measure of the skin properties.
Appendix
DEFINITIONS
Accuracy: The closeness of agreement between test method results and accepted reference values. It is a measure of test method performance and one aspect of relevance. The term is often used interchangeably with “concordance” to mean the proportion of correct outcomes of a test method (20).
C: Corrosive.
Chemical: A substance or a mixture.
Concordance: A measure of test method performance for test methods that give a categorical result, and is one aspect of relevance. The term is sometimes used interchangeably with accuracy, and is defined as the proportion of all chemicals tested that are correctly classified as positive or negative. Concordance is highly dependent on the prevalence of positives in the types of test chemical being examined (20).
GHS (Globally Harmonized System of Classification and Labelling of Chemicals (UN)): A system proposing the classification of chemicals (substances and mixtures) according to standardised types and levels of physical, health and environmental hazards, and addressing corresponding communication elements, such as pictograms, signal words, hazard statements, precautionary statements and safety data sheets, so as to convey information on their adverse effects with a view to protecting people (including employers, workers, transporters, consumers and emergency responders) and the environment (1).
IATA: Integrated Approach to Testing and Assessment.
Mixture: A mixture or solution composed of two or more substances.
Mono-constituent substance: A substance, defined by its quantitative composition, in which one main constituent is present to at least 80 % (w/w).
Multi-constituent substance: A substance, defined by its quantitative composition, in which more than one main constituent is present in a concentration ≥ 10 % (w/w) and < 80 % (w/w). A multi-constituent substance is the result of a manufacturing process. The difference between mixture and multi-constituent substance is that a mixture is obtained by blending of two or more substances without chemical reaction. A multi-constituent substance is the result of a chemical reaction.
NC: Non corrosive.
OD: Optical Density.
PC: Positive Control, a replicate containing all components of a test system and treated with a substance known to induce a positive response. To ensure that variability in the positive control response across time can be assessed, the magnitude of the positive response should not be excessive.
Performance standards (PS): Standards, based on a validated test method, that provide a basis for evaluating the comparability of a proposed test method that is mechanistically and functionally similar. Included are: (i) essential test method components; (ii) a minimum list of Reference Chemicals selected from among the chemicals used to demonstrate the acceptable performance of the validated test method; and (iii) the similar levels of reliability and accuracy, based on what was obtained for the validated test method, that the proposed test method should demonstrate when evaluated using the minimum list of Reference Chemicals.
Relevance: Description of relationship of the test method to the effect of interest and whether it is meaningful and useful for a particular purpose. It is the extent to which the test method correctly measures or predicts the biological effect of interest. Relevance incorporates consideration of the accuracy (concordance) of a test method (20).
Reliability: Measures of the extent that a test method can be performed reproducibly within and between laboratories over time, when performed using the same protocol. It is assessed by calculating intra- and inter-laboratory reproducibility (20).
Sensitivity: The proportion of all positive/active chemicals that are correctly classified by the test method. It is a measure of accuracy for a test method that produces categorical results, and is an important consideration in assessing the relevance of a test method (20).
Skin corrosion in vivo: The production of irreversible damage of the skin; namely, visible necrosis through the epidermis and into the dermis, following the application of a test chemical for up to four hours. Corrosive reactions are typified by ulcers, bleeding, bloody scabs, and, by the end of observation at 14 days, by discoloration due to blanching of the skin, complete areas of alopecia, and scars. Histopathology should be considered to evaluate questionable lesions.
Specificity: The proportion of all negative/inactive chemicals that are correctly classified by the test method. It is a measure of accuracy for a test method that produces categorical results and is an important consideration in assessing the relevance of a test method (20).
Substance: A chemical element and its compounds in the natural state or obtained by any production process, including any additive necessary to preserve its stability and any impurities deriving from the process used, but excluding any solvent which may be separated without affecting the stability of the substance or changing its composition.
(Testing) run: A single test chemical concurrently tested in a minimum of three replicate skin discs.
Test chemical: Any substance or mixture tested using this test method.
Transcutaneous Electrical Resistance (TER): A measure of the electrical impedance of the skin, expressed as a resistance value in kilo Ohms. It is a simple and robust method of assessing barrier function by recording the passage of ions through the skin using a Wheatstone bridge apparatus.
UVCB: Substances of unknown or variable composition, complex reaction products or biological materials.
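The definitions of sensitivity, specificity and concordance (accuracy) given above can be illustrated with a short calculation. This is a minimal sketch: the function name `performance` and the example outcomes are hypothetical and are not taken from any validation study.

```python
def performance(results):
    """Compute sensitivity, specificity and concordance for categorical results.

    `results` is a list of (reference, predicted) pairs, each "C" (corrosive)
    or "NC" (non-corrosive). The reference value is the accepted classification.
    """
    tp = sum(1 for ref, pred in results if ref == "C" and pred == "C")
    fn = sum(1 for ref, pred in results if ref == "C" and pred == "NC")
    tn = sum(1 for ref, pred in results if ref == "NC" and pred == "NC")
    fp = sum(1 for ref, pred in results if ref == "NC" and pred == "C")
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one C and one NC reference chemical")
    total = tp + fn + tn + fp
    return {
        # Proportion of positive chemicals correctly classified.
        "sensitivity": tp / (tp + fn),
        # Proportion of negative chemicals correctly classified.
        "specificity": tn / (tn + fp),
        # Proportion of all chemicals correctly classified.
        "concordance": (tp + tn) / total,
    }

# Hypothetical outcomes: 5 corrosive and 3 non-corrosive reference chemicals.
example = [("C", "C")] * 4 + [("C", "NC")] + [("NC", "NC")] * 2 + [("NC", "C")]
stats = performance(example)
print(stats)
```

Because concordance pools positives and negatives, its value shifts with the proportion of corrosives in the test set, which is the prevalence dependence noted in the definition.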
"
(6)
In Part B, Chapter B.40bis is replaced by the following:
"B.40bis IN VITRO SKIN CORROSION: RECONSTRUCTED HUMAN EPIDERMIS (RhE) TEST METHOD
INTRODUCTION
1.
This test method (TM) is equivalent to OECD test guideline (TG) 431 (2016). Skin corrosion refers to the production of irreversible damage to the skin manifested as visible necrosis through the epidermis and into the dermis, following the application of a test chemical [as defined by the United Nations (UN) Globally Harmonized System of Classification and Labelling of Chemicals (GHS) (1) and the European Union (EU) Regulation 1272/2008 on Classification, Labelling and Packaging of Substances and Mixtures (CLP) (6)]. This updated test method B.40bis provides an in vitro procedure allowing the identification of non-corrosive and corrosive substances and mixtures in accordance with UN GHS and CLP. It also allows a partial sub-categorisation of corrosives.
2.
The assessment of skin corrosion potential of chemicals has typically involved the use of laboratory animals (TM B.4, equivalent to OECD TG 404; originally adopted in 1981 and revised in 1992, 2002 and 2015) (2). In addition to the present test method B.40bis, two other in vitro test methods for testing corrosion potential of chemicals have been validated and adopted as TM B.40 (equivalent to OECD TG 430) (3) and TM B.65 (equivalent to OECD TG 435) (4). Furthermore, the in vitro TM B.46 (equivalent to OECD TG 439) (5) has been adopted for testing skin irritation potential. An OECD guidance document on Integrated Approaches to Testing and Assessment (IATA) for Skin Corrosion and Irritation describes several modules which group information sources and analysis tools, and (i) provides guidance on how to integrate and use existing testing and non-testing data for the assessment of skin irritation and skin corrosion potentials of chemicals and (ii) proposes an approach when further testing is needed (6).
3.
This test method addresses the human health endpoint skin corrosion. It makes use of reconstructed human epidermis (RhE) (obtained from human derived non-transformed epidermal keratinocytes) which closely mimics the histological, morphological, biochemical and physiological properties of the upper parts of the human skin, i.e. the epidermis. The corresponding OECD test guideline was originally adopted in 2004 and updated in 2013 to include additional test methods using the RhE models and the possibility to use the methods to support the sub-categorisation of corrosive chemicals, and updated in 2015 to refer to the IATA guidance document and introduce the use of an alternative procedure to measure viability.
4.
Four validated commercially available RhE models are included in this test method. Prevalidation studies (7), followed by a formal validation study for assessing skin corrosion (8)(9)(10) have been conducted (11)(12) for two of these commercially available test models, EpiSkin™ Standard Model (SM) and EpiDerm™ Skin Corrosivity Test (SCT) (EPI-200) (referred to in the following text as the Validated Reference Methods - VRMs). The outcome of these studies led to the recommendation that the two VRMs mentioned above could be used for regulatory purposes for distinguishing corrosive (C) from non-corrosive (NC) substances, and that the EpiSkin™ could moreover be used to support sub-categorisation of corrosive substances (13)(14)(15). Two other commercially available in vitro skin corrosion RhE test models have shown similar results to the EpiDerm™ VRM according to PS-based validation (16)(17)(18). These are the SkinEthic™ RHE (7) and epiCS® (previously named EST-1000) that can also be used for regulatory purposes for distinguishing corrosive from non-corrosive substances (19)(20). Post-validation studies performed by the RhE model producers in the years 2012 to 2014 with a refined protocol correcting interferences of unspecific MTT reduction by the test chemicals improved the performance of both the discrimination of C/NC and the support for sub-categorisation of corrosives (21)(22). Further statistical analyses of the post-validation data generated with EpiDerm™ SCT, SkinEthic™ RHE and epiCS® have been performed to identify alternative prediction models that improved the predictive capacity for sub-categorisation (23).
5.
Before a proposed similar or modified in vitro RhE test method for skin corrosion other than the VRMs can be used for regulatory purposes, its reliability, relevance (accuracy), and limitations for its proposed use should be determined to ensure its similarity to the VRMs, in accordance with the requirements of the Performance Standards (PS) (24) set out in accordance with the principles of OECD guidance document No 34 (25). The Mutual Acceptance of Data will only be guaranteed after any proposed new or updated test method following the PS has been reviewed and included in the corresponding test guideline. The test models included in that test guideline can be used to address countries’ requirements for test results on in vitro test methods for skin corrosion, while benefiting from the Mutual Acceptance of Data.
DEFINITIONS
6.
Definitions used are provided in Appendix 1.
INITIAL CONSIDERATIONS
7.
This test method allows the identification of non-corrosive and corrosive substances and mixtures in accordance with the UN GHS and CLP. This test method further supports the sub-categorisation of corrosive substances and mixtures into optional sub-category 1A, in accordance with the UN GHS (1), as well as a combination of sub-categories 1B and 1C (21)(22)(23). A limitation of this test method is that it does not allow discrimination between skin corrosive sub-category 1B and sub-category 1C in accordance with the UN GHS and CLP due to the limited set of well-known in vivo corrosive sub-category 1C chemicals. EpiSkin™, EpiDerm™ SCT, SkinEthic™ RHE and epiCS® test models are able to sub-categorise (i.e. 1A versus 1B-and-1C versus NC).
8.
A wide range of chemicals representing mainly individual substances has been tested in the validation supporting the test models included in this test method when they are used for identification of non-corrosives and corrosives; the empirical database of the validation study amounted to 60 chemicals covering a wide range of chemical classes (8)(9)(10). Testing to demonstrate sensitivity, specificity, accuracy and within-laboratory-reproducibility of the assay for sub-categorisation was performed by the test method developers and results were reviewed by the OECD (21) (22) (23). On the basis of the overall data available, the test method is applicable to a wide range of chemical classes and physical states including liquids, semi-solids, solids and waxes. The liquids may be aqueous or non-aqueous; solids may be soluble or insoluble in water. Whenever possible, solids should be ground to a fine powder before application; no other prior treatment of the sample is required. In cases where evidence can be demonstrated on the non-applicability of test models included in this test method to a specific category of test chemicals, they should not be used for that specific category of test chemicals. In addition, this test method is assumed to be applicable to mixtures as an extension of its applicability to substances. However, due to the fact that mixtures cover a wide spectrum of categories and composition, and that only limited information is currently available on the testing of mixtures, in cases where evidence can be demonstrated on the non-applicability of the test method to a specific category of mixtures (e.g. following a strategy as proposed in (26)), the test method should not be used for that specific category of mixtures. Before use of the test method on a mixture for generating data for an intended regulatory purpose, it should be considered whether, and if so why, it may provide adequate results for that purpose. 
Such considerations are not needed when there is a regulatory requirement for testing of the mixture. Gases and aerosols have not been assessed yet in validation studies (8)(9)(10). While it is conceivable that these can be tested using RhE technology, the current test method does not allow testing of gases and aerosols.
9.
Test chemicals absorbing light in the same range as MTT formazan and test chemicals able to directly reduce the vital dye MTT (to MTT formazan) may interfere with the tissue viability measurements and need the use of adapted controls for corrections. The type of adapted controls that may be required will vary depending on the type of interference produced by the test chemical and the procedure used to measure MTT formazan (see paragraphs 25-31).
10.
While this test method does not provide adequate information on skin irritation, it should be noted that TM B.46 specifically addresses the health effect skin irritation in vitro and is based on the same RhE test system, though using another protocol (5). For a full evaluation of local skin effects after a single dermal exposure, the OECD Guidance Document on Integrated Approaches for Testing and Assessment should be consulted (6). This IATA approach includes the conduct of in vitro tests for skin corrosion (such as described in this test method) and skin irritation before considering testing in living animals. It is recognised that the use of human skin is subject to national and international ethical considerations and conditions.
PRINCIPLE OF THE TEST
11.
The test chemical is applied topically to a three-dimensional RhE model, comprised of non-transformed, human-derived epidermal keratinocytes, which have been cultured to form a multi-layered, highly differentiated model of the human epidermis. It consists of organised basal, spinous and granular layers, and a multi-layered stratum corneum containing intercellular lamellar lipid layers representing main lipid classes analogous to those found in vivo.
12.
The RhE test method is based on the premise that corrosive chemicals are able to penetrate the stratum corneum by diffusion or erosion, and are cytotoxic to the cells in the underlying layers. Cell viability is measured by enzymatic conversion of the vital dye MTT [3-(4,5-Dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide, Thiazolyl blue tetrazolium bromide; CAS number 298-93-1], into a blue formazan salt that is quantitatively measured after extraction from tissues (27). Corrosive chemicals are identified by their ability to decrease cell viability below defined threshold levels (see paragraphs 35 and 36). The RhE-based skin corrosion test method has been shown to be predictive of in vivo skin corrosion effects assessed in rabbits according to the TM B.4 (2).
DEMONSTRATION OF PROFICIENCY
13.
Prior to routine use of any of the four validated RhE test models that adhere to this test method, laboratories should demonstrate technical proficiency by correctly classifying the twelve Proficiency Substances listed in Table 1. Where a method is used for sub-classification, the correct sub-categorisation should also be demonstrated. In situations where a listed substance is unavailable or where justifiable, another substance for which adequate in vivo and in vitro reference data are available may be used (e.g. from the list of reference chemicals (24)) provided that the same selection criteria as described in Table 1 are applied.
Combination of sub-categories 1B-and-1C In Vivo Corrosives
Glyoxylic acid monohydrate | 563-96-2 | Organic acid | 1B-and-1C | (3) 1B-and-1C | — | S
Lactic acid | 598-82-3 | Organic acid | 1B-and-1C | (3) 1B-and-1C | — | L
Ethanolamine | 141-43-5 | Organic base | 1B | (3) 1B-and-1C | Y | Viscous
Hydrochloric acid (14,4 %) | 7647-01-0 | Inorganic acid | 1B-and-1C | (3) 1B-and-1C | — | L
In Vivo Non Corrosives
Phenethyl bromide | 103-63-9 | Electrophile | NC | (3) NC | Y | L
4-Amino-1,2,4-triazole | 584-13-4 | Organic base | NC | (3) NC | — | S
4-(methylthio)-benzaldehyde | 3446-89-7 | Electrophile | NC | (3) NC | Y | L
Lauric acid | 143-07-7 | Organic acid | NC | (3) NC | — | S
Abbreviations: CASRN = Chemical Abstracts Service Registry Number; VRM = Validated Reference Method; NC = Not Corrosive; Y = yes; S = solid; L = liquid
14.
As part of the proficiency exercise, it is recommended that the user verifies the barrier properties of the tissues after receipt as specified by the RhE model manufacturer. This is particularly important if tissues are shipped over long distance/time periods. Once a test method has been successfully established and proficiency in its use has been demonstrated, such verification will not be necessary on a routine basis. However, when using a test method routinely, it is recommended to continue to assess the barrier properties at regular intervals.
PROCEDURE
15.
The following is a generic description of the components and procedures of the RhE test models for skin corrosion assessment covered by this test method. The RhE models endorsed as scientifically valid for use within this test method, i.e. the EpiSkin™ (SM), EpiDerm™ (EPI-200), SkinEthic™ RHE and epiCS® models (16)(17)(19)(28)(29)(30)(31)(32)(33), can be obtained from commercial sources. Standard Operating Procedures (SOPs) for these four RhE models are available (34)(35)(36)(37), and their main test method components are summarised in Appendix 2. It is recommended that the relevant SOP be consulted when implementing and using one of these models in the laboratory. Testing with the four RhE test models covered by this test method should comply with the following:
RHE TEST METHOD COMPONENTS
General Conditions
16.
Non-transformed human keratinocytes should be used to reconstruct the epithelium. Multiple layers of viable epithelial cells (basal layer, stratum spinosum, stratum granulosum) should be present under a functional stratum corneum. The stratum corneum should be multi-layered containing the essential lipid profile to produce a functional barrier with robustness to resist rapid penetration of cytotoxic benchmark chemicals, e.g. sodium dodecyl sulphate (SDS) or Triton X-100. The barrier function should be demonstrated and may be assessed either by determination of the concentration at which a benchmark chemical reduces the viability of the tissues by 50 % (IC50) after a fixed exposure time, or by determination of the exposure time required to reduce cell viability by 50 % (ET50) upon application of the benchmark chemical at a specified, fixed concentration (see paragraph 18). The containment properties of the RhE model should prevent the passage of material around the stratum corneum to the viable tissue, which would lead to poor modelling of skin exposure. The RhE model should be free of contamination by bacteria, viruses, mycoplasma, or fungi.
Functional Conditions
Viability
17.
The assay used for quantifying tissue viability is the MTT-assay (27). The viable cells of the RhE tissue construct reduce the vital dye MTT into a blue MTT formazan precipitate, which is then extracted from the tissue using isopropanol (or a similar solvent). The OD of the extraction solvent alone should be sufficiently small, i.e., OD < 0,1. The extracted MTT formazan may be quantified using either a standard absorbance (OD) measurement or an HPLC/UPLC-spectrophotometry procedure (38). The RhE model users should ensure that each batch of the RhE model used meets defined criteria for the negative control. An acceptability range (upper and lower limit) for the negative control OD values should be established by the RhE model developer/supplier. Acceptability ranges for the negative control OD values for the four validated RhE test models included in this test method are given in Table 2. An HPLC/UPLC-spectrophotometry user should use the negative control OD ranges provided in Table 2 as the acceptance criterion for the negative control. It should be documented that the tissues treated with negative control are stable in culture (provide similar OD measurements) for the duration of the exposure period.
Table 2
Acceptability ranges for negative control OD values to control batch quality
RhE model | Lower acceptance limit | Upper acceptance limit
EpiSkin™ (SM) | > 0,6 | < 1,5
EpiDerm™ SCT (EPI-200) | > 0,8 | < 2,8
SkinEthic™ RHE | > 0,8 | < 3,0
epiCS® | > 0,8 | < 2,8
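A laboratory could encode the Table 2 limits as a simple lookup when checking a batch. The following is a minimal sketch: the dictionary and function names are hypothetical, the limits are treated as strict inequalities as written in the table, and comparing the mean negative control OD (rather than each replicate) is an assumption.

```python
# Negative control OD acceptance limits from Table 2 (lower, upper).
NC_OD_LIMITS = {
    "EpiSkin (SM)": (0.6, 1.5),
    "EpiDerm SCT (EPI-200)": (0.8, 2.8),
    "SkinEthic RHE": (0.8, 3.0),
    "epiCS": (0.8, 2.8),
}

def negative_control_ok(model, mean_od):
    """Return True if the negative control OD lies strictly inside the range.

    Assumption: the value checked is the mean OD of the negative control
    replicates for the batch.
    """
    lower, upper = NC_OD_LIMITS[model]
    return lower < mean_od < upper

print(negative_control_ok("EpiSkin (SM)", 1.2))   # inside 0.6 to 1.5
print(negative_control_ok("epiCS", 2.9))          # above the 2.8 upper limit
```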
Barrier function
18.
The stratum corneum and its lipid composition should be sufficient to resist the rapid penetration of certain cytotoxic benchmark chemicals (e.g. SDS or Triton X-100), as estimated by IC50 or ET50 (Table 3). The barrier function of each batch of the RhE model used should be demonstrated by the RhE model developer/vendor upon supply of the tissues to the end user (see paragraph 21).
Morphology
19.
Histological examination of the RhE model should be performed, demonstrating a multi-layered human epidermis-like structure that contains stratum basale, stratum spinosum, stratum granulosum and stratum corneum and exhibits a lipid profile similar to that of human epidermis. Histological examination of each batch of the RhE model used demonstrating appropriate morphology of the tissues should be provided by the RhE model developer/vendor upon supply of the tissues to the end user (see paragraph 21).
Reproducibility
20.
Test method users should demonstrate reproducibility of the test methods over time with the positive and negative controls. Furthermore, the test method should only be used if the RhE model developer/supplier provides data demonstrating reproducibility over time with corrosive and non-corrosive chemicals from e.g. the list of Proficiency Substances (Table 1). In case of the use of a test method for subcategorisation, the reproducibility with respect to sub-categorisation should also be demonstrated.
Quality control (QC)
21.
The RhE model should only be used if the developer/supplier demonstrates that each batch of the RhE model used meets defined production release criteria, among which those for viability (paragraph 17), barrier function (paragraph 18) and morphology (paragraph 19) are the most relevant. These data are provided to the test method users, so that they are able to include this information in the test report. Only results produced with QC accepted tissue batches can be accepted for reliable prediction of corrosive classification. An acceptability range (upper and lower limit) for the IC50 or the ET50 is established by the RhE model developer/supplier. The acceptability ranges for the four validated test models are given in Table 3.
Table 3
QC batch release criteria
RhE model | Lower acceptance limit | Upper acceptance limit
EpiSkin™ (SM) (18 hours treatment with SDS) (33) | IC50 = 1,0 mg/ml | IC50 = 3,0 mg/ml
EpiDerm™ SCT (EPI-200) (1 % Triton X-100) (34) | ET50 = 4,0 hours | ET50 = 8,7 hours
SkinEthic™ RHE (1 % Triton X-100) (35) | ET50 = 4,0 hours | ET50 = 10,0 hours
epiCS® (1 % Triton X-100) (36) | ET50 = 2,0 hours | ET50 = 7,0 hours
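The batch release check on barrier function can be sketched in the same way as a lookup against Table 3. The names below are hypothetical, and treating both limits as inclusive is an assumption based on the "=" notation used in the table.

```python
# Barrier function release criteria from Table 3:
# model -> (metric, unit, lower limit, upper limit)
BARRIER_LIMITS = {
    "EpiSkin (SM)": ("IC50", "mg/ml", 1.0, 3.0),
    "EpiDerm SCT (EPI-200)": ("ET50", "hours", 4.0, 8.7),
    "SkinEthic RHE": ("ET50", "hours", 4.0, 10.0),
    "epiCS": ("ET50", "hours", 2.0, 7.0),
}

def barrier_function_ok(model, value):
    """Return True if the supplier-reported IC50/ET50 is within the range.

    Assumption: both limits are inclusive.
    """
    _metric, _unit, lower, upper = BARRIER_LIMITS[model]
    return lower <= value <= upper

metric, unit, lower, upper = BARRIER_LIMITS["epiCS"]
print(f"epiCS: {metric} must lie between {lower} and {upper} {unit}")
print(barrier_function_ok("epiCS", 5.5))
```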
Application of the Test Chemical and Control Chemicals
22.
At least two tissue replicates should be used for each test chemical and controls for each exposure time. For liquid as well as solid chemicals, a sufficient amount of test chemical should be applied to uniformly cover the epidermis surface while avoiding an infinite dose, i.e. a minimum of 70 μl/cm2 or 30 mg/cm2 should be used. Depending on the models, the epidermis surface should be moistened with deionised or distilled water before application of solid chemicals, to improve contact between the test chemical and the epidermis surface (34)(35)(36)(37). Whenever possible, solids should be tested as a fine powder. The application method should be appropriate for the test chemical (see e.g. references (34)-(37)). At the end of the exposure period, the test chemical should be carefully washed from the epidermis with an aqueous buffer, or 0,9 % NaCl. Depending on which of the four validated RhE test models is used, two or three exposure periods are used per test chemical (for all four valid RhE models: 3 min and 1 hour; for EpiSkin™ an additional exposure time of 4 hours). Depending on the RhE test model used and the exposure period assessed, the incubation temperature during exposure may vary between room temperature and 37 °C.
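The minimum dose per tissue follows directly from the per-area minima stated above. The sketch below is illustrative only: the function name is hypothetical, the 0,5 cm2 surface area is a made-up example rather than the area of any particular RhE model, and it assumes the 70 μl/cm2 figure applies to liquids and the 30 mg/cm2 figure to solids.

```python
MIN_LIQUID_UL_PER_CM2 = 70.0  # from paragraph 22
MIN_SOLID_MG_PER_CM2 = 30.0   # from paragraph 22

def minimum_dose(surface_cm2, physical_state):
    """Return (amount, unit) of the minimum dose for one tissue.

    Assumption: liquids are dosed by volume and solids by mass.
    """
    if surface_cm2 <= 0:
        raise ValueError("surface area must be positive")
    if physical_state == "liquid":
        return MIN_LIQUID_UL_PER_CM2 * surface_cm2, "ul"
    if physical_state == "solid":
        return MIN_SOLID_MG_PER_CM2 * surface_cm2, "mg"
    raise ValueError("physical_state must be 'liquid' or 'solid'")

# Hypothetical tissue surface of 0.5 cm2.
print(minimum_dose(0.5, "liquid"))
print(minimum_dose(0.5, "solid"))
```

The actual surface area and applied amounts for each model are set out in the relevant SOP, which should take precedence over this calculation.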
23.
Concurrent negative and positive controls (PC) should be used in each run to demonstrate that viability (with negative controls), barrier function and resulting tissue sensitivity (with the PC) of the tissues are within a defined historical acceptance range. The suggested PC chemicals are glacial acetic acid or 8N KOH depending upon the RhE model used. It should be noted that 8N KOH is a direct MTT reducer that might require adapted controls as described in paragraphs 25 and 26. The suggested negative controls are 0,9 % (w/v) NaCl or water.
Cell Viability Measurements
24.
The MTT assay, which is a quantitative assay, should be used to measure cell viability under this test method (27). The tissue sample is placed in MTT solution of appropriate concentration (0,3 or 1 mg/ml) for 3 hours. The precipitated blue formazan product is then extracted from the tissue using a solvent (e.g. isopropanol, acidic isopropanol), and the concentration of formazan is measured by determining the OD at 570 nm using a filter band pass of maximum ± 30 nm, or by an HPLC/UPLC-spectrophotometry procedure (see paragraphs 30 and 31)(38).
25.
Test chemicals may interfere with the MTT assay, either by direct reduction of the MTT into blue formazan, and/or by colour interference if the test chemical absorbs, naturally or due to treatment procedures, in the same OD range of formazan (570 ± 30 nm, mainly blue and purple chemicals). Additional controls should be used to detect and correct for a potential interference from these test chemicals such as the non-specific MTT reduction (NSMTT) control and the non-specific colour (NSC) control (see paragraphs 26 to 30). This is especially important when a specific test chemical is not completely removed from the tissue by rinsing or when it penetrates the epidermis, and is therefore present in the tissues when the MTT viability test is performed. Detailed description of how to correct direct MTT reduction and interferences by colouring agents is available in the SOPs for the test models (34)(35)(36)(37).
26.
To identify direct MTT reducers, each test chemical should be added to freshly prepared MTT medium (34)(35)(36)(37). If the MTT mixture containing the test chemical turns blue/purple, the test chemical is presumed to directly reduce the MTT, and a further functional check on non-viable epidermis should be performed, irrespective of whether the standard absorbance (OD) measurement or an HPLC/UPLC-spectrophotometry procedure is used. This additional functional check employs killed tissues that possess only residual metabolic activity but absorb the test chemical in a similar amount as viable tissues. Each MTT reducing chemical is applied on at least two killed tissue replicates per exposure time, which undergo the whole skin corrosion test. The true tissue viability is then calculated as the percent tissue viability obtained with living tissues exposed to the MTT reducer minus the percent non-specific MTT reduction obtained with the killed tissues exposed to the same MTT reducer, calculated relative to the negative control run concurrently to the test being corrected (%NSMTT).
27.
To identify potential interference by coloured test chemicals or test chemicals that become coloured when in contact with water or isopropanol and decide on the need for additional controls, spectral analysis of the test chemical in water (environment during exposure) and/or isopropanol (extracting solution) should be performed. If the test chemical in water and/or isopropanol absorbs light in the range of 570 ± 30 nm, further colorant controls should be performed or, alternatively, an HPLC/UPLC-spectrophotometry procedure should be used, in which case these controls are not required (see paragraphs 30 and 31). When performing the standard absorbance (OD) measurement, each interfering coloured test chemical is applied on at least two viable tissue replicates per exposure time, which undergo the entire skin corrosion test but are incubated with medium instead of MTT solution during the MTT incubation step to generate a non-specific colour (NSCliving) control. The NSCliving control needs to be performed concurrently per exposure time per coloured test chemical (in each run) due to the inherent biological variability of living tissues. The true tissue viability is then calculated as the percent tissue viability obtained with living tissues exposed to the interfering test chemical and incubated with MTT solution minus the percent non-specific colour obtained with living tissues exposed to the interfering test chemical and incubated with medium without MTT, run concurrently to the test being corrected (%NSCliving).
28.
Test chemicals that are identified as producing both direct MTT reduction (see paragraph 26) and colour interference (see paragraph 27) will also require a third set of controls, apart from the NSMTT and NSCliving controls described in the previous paragraphs, when performing the standard absorbance (OD) measurement. This is usually the case with darkly coloured test chemicals interfering with the MTT assay (e.g., blue, purple, black) because their intrinsic colour impedes the assessment of their capacity to directly reduce MTT as described in paragraph 26. These test chemicals may bind to both living and killed tissues and therefore the NSMTT control may not only correct for potential direct MTT reduction by the test chemical, but also for colour interference arising from the binding of the test chemical to killed tissues. This could lead to a double correction for colour interference since the NSCliving control already corrects for colour interference arising from the binding of the test chemical to living tissues. To avoid a possible double correction for colour interference, a third control for non-specific colour in killed tissues (NSCkilled) needs to be performed. In this additional control, the test chemical is applied on at least two killed tissue replicates per exposure time, which undergo the entire testing procedure but are incubated with medium instead of MTT solution during the MTT incubation step. A single NSCkilled control is sufficient per test chemical regardless of the number of independent tests/runs performed, but should be performed concurrently to the NSMTT control and, where possible, with the same tissue batch. 
The true tissue viability is then calculated as the percent tissue viability obtained with living tissues exposed to the test chemical minus %NSMTT minus %NSCliving plus the percent non-specific colour obtained with killed tissues exposed to the interfering test chemical and incubated with medium without MTT, calculated relative to the negative control run concurrently to the test being corrected (%NSCkilled).
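The corrections described in paragraphs 26 to 28 can be illustrated with a minimal sketch. It assumes that all inputs have already been expressed as percentages of the concurrent negative control; the function name and the example figures are hypothetical and are not taken from any SOP.

```python
def true_viability(pct_living, pct_nsmtt=0.0, pct_nsc_living=0.0, pct_nsc_killed=0.0):
    """Corrected percent tissue viability for the standard absorbance (OD) measurement.

    pct_living     : % viability of living tissues exposed to the test chemical
    pct_nsmtt      : %NSMTT, killed tissues incubated with MTT (paragraph 26)
    pct_nsc_living : %NSCliving, living tissues incubated without MTT (paragraph 27)
    pct_nsc_killed : %NSCkilled, killed tissues incubated without MTT (paragraph 28)

    Controls that were not required are left at 0, so the same expression
    covers the single-correction and the triple-correction cases.
    """
    return pct_living - pct_nsmtt - pct_nsc_living + pct_nsc_killed


# Illustrative values only (hypothetical data).
print(true_viability(40.0, pct_nsmtt=10.0))                 # direct MTT reducer only
print(true_viability(62.0, 8.0, 5.0, 3.0))                  # reducer and colour interference
```

Adding %NSCkilled back is what avoids the double correction for colour described in paragraph 28, since the NSMTT control already contains the colour bound to killed tissues.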
29.
It is important to note that non-specific MTT reduction and non-specific colour interferences may increase the readouts of the tissue extract above the linearity range of the spectrophotometer. On this basis, each laboratory should determine the linearity range of its spectrophotometer with MTT formazan (CAS # 57360-69-7) from a commercial source before initiating the testing of test chemicals for regulatory purposes. In particular, the standard absorbance (OD) measurement using a spectrophotometer is appropriate to assess direct MTT-reducers and colour interfering test chemicals when the ODs of the tissue extracts obtained with the test chemical without any correction for direct MTT reduction and/or colour interference are within the linear range of the spectrophotometer or when the uncorrected percent viability obtained with the test chemical already defines it as corrosive (see paragraphs 35 and 36). Nevertheless, results for test chemicals producing %NSMTT and/or %NSCliving > 50 % of the negative control should be taken with caution.
30.
For coloured test chemicals which are not compatible with the standard absorbance (OD) measurement due to too strong interference with the MTT assay, the alternative HPLC/UPLC-spectrophotometry procedure to measure MTT formazan may be employed (see paragraph 31) (37). The HPLC/UPLC-spectrophotometry system allows for the separation of the MTT formazan from the test chemical before its quantification (38). For this reason, NSCliving or NSCkilled controls are never required when using HPLC/UPLC-spectrophotometry, independently of the chemical being tested. NSMTT controls should nevertheless be used if the test chemical is suspected to directly reduce MTT or has a colour that impedes the assessment of the capacity to directly reduce MTT (as described in paragraph 26). When using HPLC/UPLC-spectrophotometry to measure MTT formazan, the percent tissue viability is calculated as percent MTT formazan peak area obtained with living tissues exposed to the test chemical relative to the MTT formazan peak obtained with the concurrent negative control. For test chemicals able to directly reduce MTT, true tissue viability is calculated as the percent tissue viability obtained with living tissues exposed to the test chemical minus %NSMTT. Finally, it should be noted that direct MTT-reducers that may also be colour interfering, which are retained in the tissues after treatment and reduce MTT so strongly that they lead to ODs (using standard OD measurement) or peak areas (using HPLC/UPLC-spectrophotometry) of the tested tissue extracts that fall outside of the linearity range of the spectrophotometer, cannot be assessed, although these are expected to occur in only very rare situations.
31.
HPLC/UPLC-spectrophotometry may also be used with all types of test chemicals (coloured, non-coloured, MTT-reducers and non-MTT reducers) for measurement of MTT formazan (38). Due to the diversity of HPLC/UPLC-spectrophotometry systems, qualification of the HPLC/UPLC-spectrophotometry system should be demonstrated before its use to quantify MTT formazan from tissue extracts by meeting the acceptance criteria for a set of standard qualification parameters based on those described in the U.S. Food and Drug Administration guidance for industry on bio-analytical method validation (38)(39). These key parameters and their acceptance criteria are shown in Appendix 4. Once the acceptance criteria defined in Appendix 4 have been met, the HPLC/UPLC-spectrophotometry system is considered qualified and ready to measure MTT formazan under the experimental conditions described in this test method.
Acceptability Criteria
32.
For each test method using valid RhE models, tissues treated with the negative control should exhibit OD reflecting the quality of the tissues as described in table 2 and should not be below historically established boundaries. Tissues treated with the PC, i.e. glacial acetic acid or 8N KOH, should reflect the ability of the tissues to respond to a corrosive chemical under the conditions of the test model (see Appendix 2). The variability between tissue replicates of test chemical and/or control chemicals should fall within the accepted limits for each valid RhE model (see Appendix 2) (e.g. the difference of viability between the two tissue replicates should not exceed 30 %). If either the negative control or the PC included in a run falls outside the accepted ranges, the run is considered as not qualified and should be repeated. If the variability of a test chemical falls outside the defined range, its testing should be repeated.
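The acceptance checks in paragraph 32 could be sketched as follows. The 30 % replicate limit is the example given above; the negative control OD range and the PC viability range used in the usage lines are placeholders only, since the actual limits are model-specific (table 2 and Appendix 2). All function and parameter names are hypothetical.

```python
def replicates_within_limit(viabilities, max_difference=30.0):
    """True when the spread between tissue replicates (% viability) is acceptable."""
    if len(viabilities) < 2:
        raise ValueError("at least two tissue replicates are needed")
    return max(viabilities) - min(viabilities) <= max_difference


def run_is_qualified(nc_mean_od, nc_od_range, pc_viability, pc_viability_range):
    """True when both the negative control and the PC fall within their accepted ranges."""
    nc_low, nc_high = nc_od_range
    pc_low, pc_high = pc_viability_range
    nc_ok = nc_low <= nc_mean_od <= nc_high
    pc_ok = pc_low <= pc_viability <= pc_high
    return nc_ok and pc_ok


# Placeholder acceptance ranges, for illustration only.
print(run_is_qualified(1.2, (0.8, 2.8), 5.0, (0.0, 15.0)))
print(replicates_within_limit([55.0, 80.0]))
```

A run that fails `run_is_qualified` would be repeated as a whole, whereas a failure of `replicates_within_limit` for a single test chemical only triggers retesting of that chemical.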
Interpretation of Results and Prediction Model
33.
The OD values obtained for each test chemical should be used to calculate percentage of viability relative to the negative control, which is set at 100 %. In case HPLC/UPLC-spectrophotometry is used, the percent tissue viability is calculated as percent MTT formazan peak area obtained with living tissues exposed to the test chemical relative to the MTT formazan peak obtained with the concurrent negative control. The cut-off percentage cell viability values distinguishing corrosive from non-corrosive test chemical (or discriminating between different corrosive sub-categories) are defined below in paragraphs 35 and 36 for each of the test models covered by this test method and should be used for interpreting the results.
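As an illustration of the calculation in paragraph 33, the sketch below expresses each replicate reading as a percentage of the mean concurrent negative control, which is set at 100 %. It applies equally to OD values and to MTT formazan peak areas. The function name and the readings are hypothetical, and any blank subtraction required by a given SOP is assumed to have been applied beforehand.

```python
def percent_viability(test_readings, negative_control_readings):
    """Return (per-replicate % viability, mean % viability) relative to the negative control."""
    if not test_readings or not negative_control_readings:
        raise ValueError("readings must not be empty")
    nc_mean = sum(negative_control_readings) / len(negative_control_readings)
    if nc_mean <= 0:
        raise ValueError("mean negative control reading must be positive")
    per_replicate = [100.0 * reading / nc_mean for reading in test_readings]
    return per_replicate, sum(per_replicate) / len(per_replicate)


# Hypothetical OD readings for two tissue replicates.
per_replicate, mean_viability = percent_viability([0.5, 0.75], [1.25, 1.25])
print(per_replicate, mean_viability)
```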
34.
A single testing run composed of at least two tissue replicates should be sufficient for a test chemical when the resulting classification is unequivocal. However, in cases of borderline results, such as non-concordant replicate measurements, a second run may be considered, as well as a third one in case of discordant results between the first two runs.
35.
The prediction model for the EpiSkin™ skin corrosion test model (9)(34)(22), associated with the UN GHS/CLP classification system, is shown in Table 4:
Table 4
EpiSkin™ prediction model
Viability measured after exposure time points (t=3, 60 and 240 minutes)
A combination of optional sub-categories 1B-and-1C
≥ 35 % after 240 min exposure
Non-corrosive
36.
The prediction models for the EpiDerm™ SCT (10)(23)(35), the SkinEthic™ RHE (17)(18) (23) (36), and the epiCS® (16)(23)(37) skin corrosion test models, associated with the UN GHS/CLP classification system, are shown in Table 5:
Table 5
EpiDerm™ SCT, SkinEthic™ RHE and epiCS®
Viability measured after exposure time points (t=3 and 60 minutes) | Prediction to be considered
STEP 1 for EpiDerm™ SCT, for SkinEthic™ RHE and epiCS®
< 50 % after 3 min exposure | Corrosive
≥ 50 % after 3 min exposure AND < 15 % after 60 min exposure | Corrosive
≥ 50 % after 3 min exposure AND ≥ 15 % after 60 min exposure | Non-corrosive
STEP 2 for EpiDerm™ SCT - for substances/mixtures identified as Corrosive in step 1
< 25 % after 3 min exposure | Optional sub-category 1A *
≥ 25 % after 3 min exposure | A combination of optional sub-categories 1B and 1C
STEP 2 for SkinEthic™ RHE - for substances/mixtures identified as Corrosive in step 1
< 18 % after 3 min exposure | Optional sub-category 1A *
≥ 18 % after 3 min exposure | A combination of optional sub-categories 1B and 1C
STEP 2 for epiCS® - for substances/mixtures identified as Corrosive in step 1
< 15 % after 3 min exposure | Optional sub-category 1A *
≥ 15 % after 3 min exposure | A combination of optional sub-categories 1B and 1C
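The two-step prediction model of Table 5 can be expressed as a short decision function. This is a sketch only: the cut-off values are those listed in Table 5, while the function name, the model keys and the returned strings are illustrative choices, and the note marked with * in the table is not modelled.

```python
# Step 2 cut-offs (% viability after 3 min exposure) per test model, from Table 5.
STEP2_CUTOFF_3MIN = {"EpiDerm SCT": 25.0, "SkinEthic RHE": 18.0, "epiCS": 15.0}


def predict(model, viability_3min, viability_60min):
    """Apply Step 1 (corrosive or not) and, if corrosive, Step 2 (sub-category)."""
    if model not in STEP2_CUTOFF_3MIN:
        raise ValueError(f"unknown test model: {model}")

    # Step 1: corrosive if < 50 % after 3 min, or >= 50 % after 3 min and < 15 % after 60 min.
    corrosive = viability_3min < 50.0 or viability_60min < 15.0
    if not corrosive:
        return "Non-corrosive"

    # Step 2: model-specific cut-off on the 3 min viability.
    if viability_3min < STEP2_CUTOFF_3MIN[model]:
        return "Corrosive: optional sub-category 1A"
    return "Corrosive: a combination of optional sub-categories 1B and 1C"


print(predict("EpiDerm SCT", 20.0, 5.0))
print(predict("SkinEthic RHE", 20.0, 5.0))
print(predict("epiCS", 60.0, 20.0))
```

The same 3 min viability of 20 % leads to different sub-categories for EpiDerm™ SCT and SkinEthic™ RHE, which is why the cut-off is looked up per test model rather than fixed.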
DATA AND REPORTING
Data
37.
For each test, data from individual tissue replicates (e.g. OD values and calculated percentage cell viability for each test chemical, including classification) should be reported in tabular form, including data from repeat experiments as appropriate. In addition, means and ranges of viability and CVs between tissue replicates for each test should be reported. Observed interactions with MTT reagent by direct MTT reducers or coloured test chemicals should be reported for each tested chemical.
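The summary statistics requested in paragraph 37 could be derived as in the following sketch. It assumes the CV is taken as the sample standard deviation divided by the mean, expressed in percent; the function name and the replicate values are hypothetical.

```python
from statistics import mean, stdev


def summarise_replicates(viabilities):
    """Mean, range, SD and CV (%) of percent viability across tissue replicates."""
    if len(viabilities) < 2:
        raise ValueError("at least two tissue replicates are needed")
    average = mean(viabilities)
    sd = stdev(viabilities)  # sample standard deviation (n - 1 denominator)
    cv = 100.0 * sd / average if average else float("nan")
    return {
        "mean": average,
        "range": (min(viabilities), max(viabilities)),
        "sd": sd,
        "cv_percent": cv,
    }


# Hypothetical replicate viabilities (%).
print(summarise_replicates([40.0, 60.0]))
```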
Test report
38.
The test report should include the following information:
Test Chemical and Control Chemicals:
—
Mono-constituent substance: chemical identification, such as IUPAC or CAS name, CAS number, SMILES or InChI code, structural formula, purity, chemical identity of impurities as appropriate and practically feasible, etc;
—
Multi-constituent substance, UVCB and mixture: characterised as far as possible by chemical identity (see above), quantitative occurrence and relevant physicochemical properties of the constituents;
—
Physical appearance, water solubility, and any additional relevant physicochemical properties;
—
Source, lot number if available;
—
Treatment of the test chemical/control substance prior to testing, if applicable (e.g. warming, grinding);
—
Stability of the test chemical, limit date for use, or date for re-analysis if known;
—
Storage conditions.
RhE model and protocol used and rationale for it (if applicable)
Test Conditions:
—
RhE model used (including batch number);
—
Calibration information for measuring device (e.g. spectrophotometer), wavelength and band pass (if applicable) used for quantifying MTT formazan, and linearity range of measuring device;
—
Description of the method used to quantify MTT formazan;
—
Description of the qualification of the HPLC/UPLC-spectrophotometry system, if applicable;
—
Complete supporting information for the specific RhE model used including its performance. This should include, but is not limited to:
i)
Viability;
ii)
Barrier function;
iii)
Morphology;
iv)
Reproducibility and predictive capacity;
v)
Quality controls (QC) of the model;
—
Reference to historical data of the model. This should include, but is not limited to acceptability of the QC data with reference to historical batch data;
—
Demonstration of proficiency in performing the test method before routine use by testing of the proficiency substances.
Test Procedure:
—
Details of the test procedure used (including washing procedures used after exposure period);
—
Doses of test chemical and control chemicals used;
—
Duration of exposure period(s) and temperature(s) of exposure;
—
Indication of controls used for direct MTT-reducers and/or colouring test chemicals, if applicable;
—
Number of tissue replicates used per test chemical and controls (PC, negative control, and NSMTT, NSCliving and NSCkilled, if applicable), per exposure time;
—
Description of decision criteria/prediction model applied based on the RhE model used;
—
Description of any modifications of the test procedure (including washing procedures).
Run and Test Acceptance Criteria:
—
Positive and negative control mean values and acceptance ranges based on historical data;
—
Acceptable variability between tissue replicates for positive and negative controls;
—
Acceptable variability between tissue replicates for test chemical.
Results:
—
Tabulation of data for individual test chemicals and controls, for each exposure period, each run and each replicate measurement including OD or MTT formazan peak area, percent tissue viability, mean percent tissue viability, differences between replicates, SDs and/or CVs if applicable;
—
If applicable, results of controls used for direct MTT-reducers and/or colouring test chemicals including OD or MTT formazan peak area, %NSMTT, %NSCliving, %NSCkilled, differences between tissue replicates, SDs and/or CVs (if applicable), and final correct percent tissue viability;
—
Results obtained with the test chemical(s) and control chemicals in relation to the defined run and test acceptance criteria;
—
Description of other effects observed;
—
The derived classification with reference to the prediction model/decision criteria used.
Discussion of the results
Conclusions
LITERATURE
(1)
UN (2013). United Nations Globally Harmonized System of Classification and Labelling of Chemicals (GHS). Fifth Revised Edition, UN New York and Geneva. Available at: http://www.unece.org/trans/danger/publi/ghs/ghs_rev05/05files_e.html
(2)
Chapter B.4 of this Annex, Acute Dermal Irritation, Corrosion.
(3)
Chapter B.40 of this Annex, In Vitro Skin Corrosion.
(4)
Chapter B.65 of this Annex, In Vitro Membrane Barrier Test Method.
(5)
Chapter B.46 of this Annex, In Vitro Skin Irritation: Reconstructed Human Epidermis Test Method.
(6)
OECD (2014). Guidance Document on Integrated Approaches to Testing and Assessment of Skin Irritation/Corrosion. Environment, Health and Safety Publications, Series on Testing and Assessment, (No 203) Organisation for Economic Cooperation and Development, Paris.
(7)
Botham P.A., Chamberlain M., Barratt M.D., Curren R.D., Esdaile D.J., Gardner J.R., Gordon V.C., Hildebrand B., Lewis R.W., Liebsch M., Logemann P., Osborne R., Ponec M., Regnier J.F., Steiling W., Walker A.P., and Balls M. (1995). A Prevalidation Study on In Vitro Skin Corrosivity Testing. The Report and Recommendations of ECVAM Workshop 6. ATLA 23:219-255.
(8)
Barratt M.D., Brantom P.G., Fentem J.H., Gerner I., Walker A.P., and Worth A.P. (1998). The ECVAM International Validation Study on In Vitro Tests for Skin Corrosivity. 1. Selection and distribution of the Test Chemicals. Toxicol.In Vitro 12:471-482.
(9)
Fentem J.H., Archer G.E.B., Balls M., Botham P.A., Curren R.D., Earl L.K., Esdaile D.J., Holzhutter H.-G., and Liebsch M. (1998). The ECVAM International Validation Study on In Vitro Tests for Skin Corrosivity. 2. Results and Evaluation by the Management Team. Toxicol. In Vitro 12:483-524.
(10)
Liebsch M., Traue D., Barrabas C., Spielmann H., Uphill, P., Wilkins S., Wiemann C., Kaufmann T., Remmele M. and Holzhütter H. G. (2000). The ECVAM Prevalidation Study on the Use of EpiDerm for Skin Corrosivity Testing, ATLA 28: 371-401.
(11)
Balls M., Blaauboer B.J., Fentem J.H., Bruner L., Combes R.D., Ekwall B., Fielder R.J., Guillouzo A., Lewis R.W., Lovell D.P., Reinhardt C.A., Repetto G., Sladowski D., Spielmann H. and Zucco F. (1995). Practical Aspects of the Validation of Toxicity Test Procedures. The Report and Recommendations of ECVAM Workshops, ATLA 23:129-147.
(12)
ICCVAM (Interagency Coordinating Committee on the Validation of Alternative Methods) (1997). Validation and Regulatory Acceptance of Toxicological Test Methods. NIH Publication No 97-3981. National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA.
(13)
ICCVAM (Interagency Coordinating Committee on the Validation of Alternative Methods) (2002). ICCVAM evaluation of EpiDerm™ (EPI-200), EPISKIN™ (SM), and the Rat Skin Transcutaneous Electrical Resistance (TER) Assay: In Vitro Test Methods for Assessing Dermal Corrosivity Potential of Chemicals. NIH Publication No 02-4502. National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA.
(14)
EC-ECVAM (1998). Statement on the Scientific Validity of the EpiSkin™ Test (an In Vitro Test for Skin Corrosivity), Issued by the ECVAM Scientific Advisory Committee (ESAC10), 3 April 1998.
(15)
EC-ECVAM (2000). Statement on the Application of the EpiDerm™ Human Skin Model for Skin Corrosivity Testing, Issued by the ECVAM Scientific Advisory Committee (ESAC14), 21 March 2000.
(16)
Hoffmann J., Heisler E., Karpinski S., Losse J., Thomas D., Siefken W., Ahr H.J., Vohr H.W. and Fuchs H.W. (2005). Epidermal-Skin-Test 1000 (EST-1000)-A New Reconstructed Epidermis for In Vitro Skin Corrosivity Testing. Toxicol.In Vitro 19: 925-929.
(17)
Kandárová H., Liebsch M., Spielmann H., Genschow E., Schmidt E., Traue D., Guest R., Whittingham A., Warren N., Gamer A.O., Remmele M., Kaufmann T., Wittmer E., De Wever B., and Rosdy M. (2006). Assessment of the Human Epidermis Model SkinEthic RHE for In Vitro Skin Corrosion Testing of Chemicals According to New OECD TG 431. Toxicol. In Vitro 20:547-559.
(18)
Tornier C., Roquet M. and Fraissinette A.B. (2010). Adaptation of the Validated SkinEthic™ Reconstructed Human Epidermis (RHE) Skin Corrosion Test Method to 0,5 cm2 Tissue Sample. Toxicol. In Vitro 24: 1379-1385.
(19)
EC-ECVAM (2006). Statement on the Application of the SkinEthic™ Human Skin Model for Skin Corrosivity Testing, Issued by the ECVAM Scientific Advisory Committee (ESAC25), 17 November 2006.
(20)
EC-ECVAM (2009). ESAC Statement on the Scientific Validity of an In-Vitro Test Method for Skin Corrosivity Testing: the EST-1000, Issued by the ECVAM Scientific Advisory Committee (ESAC30), 12 June 2009.
(21)
OECD (2013). Summary Document on the Statistical Performance of Methods in OECD Test Guideline 431 for Sub-categorisation. Environment, Health, and Safety Publications, Series on Testing and Assessment (No 190). Organisation for Economic Cooperation and Development, Paris.
(22)
Alépée N., Grandidier M.H., and Cotovio J. (2014). Sub-Categorisation of Skin Corrosive Chemicals by the EpiSkin™ Reconstructed Human Epidermis Skin Corrosion Test Method According to UN GHS: Revision of OECD Test Guideline 431. Toxicol. In Vitro 28:131-145.
(23)
Desprez B., Barroso J., Griesinger C., Kandárová H., Alépée N., and Fuchs, H. (2015). Two Novel Prediction Models Improve Predictions of Skin Corrosive Sub-categories by Test Methods of OECD Test Guideline No 431. Toxicol. In Vitro 29:2055-2080.
(24)
OECD (2015). Performance Standards for the Assessment of Proposed Similar or Modified In Vitro Reconstructed Human Epidermis (RHE) Test Methods For Skin Corrosion in Relation to OECD TG 431. Environmental Health and Safety Publications, Series on Testing and Assessment (No 219). Organisation for Economic Cooperation and Development, Paris
(25)
OECD (2005). Guidance Document on the Validation and International Acceptance of New or Updated Test Methods for Hazard Assessment. Environment, Health and Safety Publications, Series on Testing and Assessment (No 34), Organisation for Economic Cooperation and Development, Paris.
(26)
Eskes C. et al. (2012). Regulatory Assessment of In Vitro Skin Corrosion and Irritation Data Within the European Framework: Workshop Recommendations. Regul.Toxicol.Pharmacol. 62:393-403.
(27)
Mosmann T. (1983). Rapid Colorimetric Assay for Cellular Growth and Survival: Application to Proliferation and Cytotoxicity Assays. J. Immunol. Methods 65:55-63.
(28)
Tinois E., et al. (1994). The Episkin Model: Successful Reconstruction of Human Epidermis In Vitro. In: In Vitro Skin Toxicology. Rougier A., Goldberg A.M. and Maibach H.I. (Eds): 133-140.
(29)
Cannon C.L., Neal P.J., Southee J.A., Kubilus J. and Klausner M. (1994). New Epidermal Model for Dermal Irritancy Testing. Toxicol. In Vitro 8:889-891.
(30)
Ponec M., Boelsma E, Weerheim A, Mulder A, Bouwstra J and Mommaas M. (2000). Lipid and Ultrastructural Characterization of Reconstructed Skin Models. Inter. J. Pharmaceu. 203:211 - 225.
(31)
Tinois E., Tillier, J., Gaucherand, M., Dumas, H., Tardy, M. and Thivolet J. (1991). In Vitro and Post - Transplantation Differentiation of Human Keratinocytes Grown on the Human Type IV Collagen Film of a Bilayered Dermal Substitute. Exp. Cell Res. 193:310-319.
(32)
Parenteau N.L., Bilbo P, Nolte CJ, Mason VS and Rosenberg M. (1992). The Organotypic Culture of Human Skin Keratinocytes and Fibroblasts to Achieve Form and Function. Cytotech. 9:163-171.
(33)
Wilkins L.M., Watson SR, Prosky SJ, Meunier SF and Parenteau N.L. (1994). Development of a Bilayered Living Skin Construct for Clinical Applications. Biotech. Bioeng. 43/8:747-756.
(35)
EpiDerm™ SOP (February 2012). Version MK-24-007-0024 Protocol for: In Vitro EpiDerm™ Skin Corrosion Test (EPI-200-SCT), for Use with MatTek Corporation’s Reconstructed Human Epidermal Model EpiDerm.
(37)
EpiCS® SOP (January 2012). Version 4.1 In Vitro Skin Corrosion: Human Skin Model Test Epidermal Skin Test 1000 (epiCS®) CellSystems.
(38)
Alépée N., Barroso J., De Smedt A., De Wever B., Hibatallah J., Klaric M., Mewes K.R., Millet M., Pfannenbecker U., Tailhardat M., Templier M., and McNamee P. (2015). Use of HPLC/UPLC-spectrophotometry for Detection of MTT Formazan in In Vitro Reconstructed Human Tissue (RhT)-based Test Methods Employing the MTT Assay to Expand their Applicability to Strongly Coloured Test Chemicals. Toxicol. In Vitro 29:741-761.
(39)
US FDA (2001). Guidance for Industry: Bioanalytical Method Validation. U.S. Department of Health and Human Services, Food and Drug Administration. (May 2001). Available at: [http://www.fda.gov/downloads/Drugs/Guidances/ucm070107.pdf].
Appendix 1
DEFINITIONS
Accuracy: The closeness of agreement between test method results and accepted reference values. It is a measure of test method performance and one aspect of relevance. The term is often used interchangeably with “concordance” to mean the proportion of correct outcomes of a test method (25).
Cell viability: Parameter measuring total activity of a cell population e.g. as ability of cellular mitochondrial dehydrogenases to reduce the vital dye MTT (3-(4,5-Dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide, Thiazolyl blue), which depending on the endpoint measured and the test design used, correlates with the total number and/or vitality of living cells.
Chemical: A substance or a mixture.
Concordance: This is a measure of test method performance for test methods that give a categorical result, and is one aspect of relevance. The term is sometimes used interchangeably with accuracy, and is defined as the proportion of all chemicals tested that are correctly classified as positive or negative. Concordance is highly dependent on the prevalence of positives in the types of test chemical being examined (25).
ET50: Can be estimated by determination of the exposure time required to reduce cell viability by 50 % upon application of the benchmark chemical at a specified, fixed concentration, see also IC50.
GHS (Globally Harmonized System of Classification and Labelling of Chemicals): A system proposing the classification of chemicals (substances and mixtures) according to standardized types and levels of physical, health and environmental hazards, and addressing corresponding communication elements, such as pictograms, signal words, hazard statements, precautionary statements and safety data sheets, so as to convey information on their adverse effects with a view to protecting people (including employers, workers, transporters, consumers and emergency responders) and the environment (1).
HPLC: High Performance Liquid Chromatography.
IATA: Integrated Approach on Testing and Assessment.
IC50: Can be estimated by determination of the concentration at which a benchmark chemical reduces the viability of the tissues by 50 % (IC50) after a fixed exposure time, see also ET50.
Infinite dose: Amount of test chemical applied to the epidermis exceeding the amount required to completely and uniformly cover the epidermis surface.
Mixture: A mixture or solution composed of two or more substances in which they do not react.
Mono-constituent substance: A substance, defined by its quantitative composition, in which one main constituent is present to at least 80 % (w/w).
MTT: 3-(4,5-Dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide; Thiazolyl blue tetrazolium bromide.
Multi-constituent substance: A substance, defined by its quantitative composition, in which more than one main constituent is present in a concentration > 10 % (w/w) and < 80 % (w/w). A multi-constituent substance is the result of a manufacturing process. The difference between mixture and multi-constituent substance is that a mixture is obtained by blending of two or more substances without chemical reaction. A multi-constituent substance is the result of a chemical reaction.
NC: Non corrosive.
NSCkilled control: Non-Specific Colour control in killed tissues.
NSCliving control: Non-Specific Colour control in living tissues.
NSMTT: Non-Specific MTT reduction.
OD: Optical Density
PC: Positive Control, a replicate containing all components of a test system and treated with a chemical known to induce a positive response. To ensure that variability in the positive control response across time can be assessed, the magnitude of the positive response should not be excessive.
Performance standards (PS): Standards, based on a validated test method, that provide a basis for evaluating the comparability of a proposed test method that is mechanistically and functionally similar. Included are: (i) essential test method components; (ii) a minimum list of Reference Chemicals selected from among the chemicals used to demonstrate the acceptable performance of the validated test method; and (iii) the similar levels of reliability and accuracy, based on what was obtained for the validated test method, that the proposed test method should demonstrate when evaluated using the minimum list of Reference Chemicals (25).
Relevance: Description of relationship of the test method to the effect of interest and whether it is meaningful and useful for a particular purpose. It is the extent to which the test method correctly measures or predicts the biological effect of interest. Relevance incorporates consideration of the accuracy (concordance) of a test method (25).
Reliability: Measures of the extent that a test method can be performed reproducibly within and between laboratories over time, when performed using the same protocol. It is assessed by calculating intra- and inter-laboratory reproducibility (25).
Run: A run consists of one or more test chemicals tested concurrently with a negative control and with a PC.
Sensitivity: The proportion of all positive/active chemicals that are correctly classified by the test method. It is a measure of accuracy for a test method that produces categorical results, and is an important consideration in assessing the relevance of a test method (25).
Skin corrosion in vivo: The production of irreversible damage of the skin; namely, visible necrosis through the epidermis and into the dermis, following the application of a test chemical for up to four hours. Corrosive reactions are typified by ulcers, bleeding, bloody scabs, and, by the end of observation at 14 days, by discoloration due to blanching of the skin, complete areas of alopecia, and scars. Histopathology should be considered to evaluate questionable lesions.
Specificity: The proportion of all negative/inactive chemicals that are correctly classified by the test method. It is a measure of accuracy for a test method that produces categorical results and is an important consideration in assessing the relevance of a test method (25).
Substance: A chemical element and its compounds in the natural state or obtained by any production process, including any additive necessary to preserve its stability and any impurities deriving from the process used, but excluding any solvent which may be separated without affecting the stability of the substance or changing its composition.
Test chemical: Any substance or mixture tested using this test method.
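The definitions of sensitivity, specificity and concordance above can be illustrated with a minimal sketch for a test method giving a categorical (positive/negative) result. The function name and the example classifications are hypothetical and do not represent validation data.

```python
def performance(reference, predicted):
    """Sensitivity, specificity and concordance from paired reference/predicted outcomes.

    Both arguments are sequences of booleans (True = positive, e.g. corrosive).
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    pairs = list(zip(reference, predicted))
    tp = sum(1 for ref, pred in pairs if ref and pred)
    fn = sum(1 for ref, pred in pairs if ref and not pred)
    tn = sum(1 for ref, pred in pairs if not ref and not pred)
    fp = sum(1 for ref, pred in pairs if not ref and pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both positive and negative reference chemicals are needed")
    return {
        "sensitivity": tp / (tp + fn),          # positives correctly classified
        "specificity": tn / (tn + fp),          # negatives correctly classified
        "concordance": (tp + tn) / len(pairs),  # all chemicals correctly classified
    }


# Hypothetical set of 10 positive and 10 negative reference chemicals.
reference = [True] * 10 + [False] * 10
predicted = [True] * 8 + [False] * 2 + [False] * 9 + [True] * 1
print(performance(reference, predicted))
```

Because concordance pools positives and negatives, it shifts with the proportion of positives in the chemical set, which is the prevalence dependence noted in the definition of concordance.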
→ if solution turns blue/purple, freeze-killed adapted controls should be performed
Pre-check for colour interference
10 μl (liquid) or 10 mg (solid) + 90 μl H2O mixed for 15 min at RT
→ if solution becomes coloured, living adapted controls should be performed
50 μl (liquid) or 25 mg (solid) + 300 μl H2O for 60 min at 37 °C, 5 % CO2, 95 % RH
→ if solution becomes coloured, living adapted controls should be performed
40 μl (liquid) or 20 mg (solid) + 300 μl H2O mixed for 60 min at RT
→ if test chemical is coloured, living adapted controls should be performed
50 μl (liquid) or 25 mg (solid) + 300 μl H2O for 60 min at 37 °C, 5 % CO2, 95 % RH
→ if solution becomes coloured, living adapted controls should be performed
Exposure time and temperature
3 min, 60 min (± 5 min) and 240 min (± 10 min)
In ventilated cabinet Room Temperature (RT, 18-28 °C)
3 min at RT, and 60 min at 37 °C, 5 % CO2, 95 % RH
3 min at RT, and 60 min at 37 °C, 5 % CO2, 95 % RH
3 min at RT, and 60 min at 37 °C, 5 % CO2, 95 % RH
Rinsing
25 ml 1x PBS (2 ml/throwing)
20 times with a constant soft stream of 1x PBS
20 times with a constant soft stream of 1x PBS
20 times with a constant soft stream of 1x PBS
Negative control
50 μl NaCl solution (9 g/l)
Tested with every exposure time
50 μl H2O
Tested with every exposure time
40 μl H2O
Tested with every exposure time
50 μl H2O
Tested with every exposure time
Positive control
50 μl Glacial acetic acid
Tested only for 4 hours
50 μl 8N KOH
Tested with every exposure time
40 μl 8N KOH
Tested only for 1 hour
50 μl 8N KOH
Tested with every exposure time
MTT solution
2 ml 0,3 mg/ml
300 μl 1 mg/ml
300 μl 1 mg/ml
300 μl 1 mg/ml
MTT incubation time and temperature
180 min (± 15 min) at 37 °C, 5 % CO2, 95 % RH
180 min at 37 °C, 5 % CO2, 95 % RH
180 min (± 15 min) at 37 °C, 5 % CO2, 95 % RH
180 min at 37 °C, 5 % CO2, 95 % RH
Extraction solvent
500 μl acidified isopropanol
(0.04 N HCl in isopropanol)
(isolated tissue fully immersed)
2 ml isopropanol
(extraction from top and bottom of insert)
1.5 ml isopropanol
(extraction from top and bottom of insert)
2 ml isopropanol
(extraction from top and bottom of insert)
Extraction time and temperature
Overnight at RT, protected from light
Overnight without shaking at RT or for 120 min with shaking (~120 rpm) at RT
Overnight without shaking at RT or for 120 min with shaking (~120 rpm) at RT
Overnight without shaking at RT or for 120 min with shaking (~120 rpm) at RT
OD reading
570 nm (545 - 595 nm) without reference filter
570 nm (or 540 nm) without reference filter
570 nm (540 - 600 nm) without reference filter
540 - 570 nm without reference filter
Tissue Quality Control
18 hours treatment with SDS
1.0 mg/ml ≤ IC50 ≤ 3.0 mg/ml
Treatment with 1 % Triton X-100
4.08 hours ≤ ET50 ≤ 8.7 hours
Treatment with 1 % Triton X-100
4.0 hours ≤ ET50 ≤ 10.0 hours
Treatment with 1 % Triton X-100
2.0 hours ≤ ET50 ≤ 7.0 hours
Acceptability Criteria
1.
Mean OD of the tissue replicates treated with the negative control (NaCl) should be ≥ 0.6 and ≤ 1.5 for every exposure time
2.
Mean viability of the tissue replicates exposed for 4 hours with the positive control (glacial acetic acid), expressed as % of the negative control, should be ≤ 20 %
3.
In the range 20-100 % viability and for ODs ≥ 0.3, difference of viability between the two tissue replicates should not exceed 30 %.
1.
Mean OD of the tissue replicates treated with the negative control (H2O) should be ≥ 0.8 and ≤ 2.8 for every exposure time
2.
Mean viability of the tissue replicates exposed for 1 hour with the positive control (8N KOH), expressed as % of the negative control, should be < 15 %
3.
In the range 20 - 100 % viability, the Coefficient of Variation (CV) between tissue replicates should be ≤ 30 %
1.
Mean OD of the tissue replicates treated with the negative control (H2O) should be ≥ 0.8 and ≤ 3.0 for every exposure time
2.
Mean viability of the tissue replicates exposed for 1 hour (and 4 hours, if applicable) with the positive control (8N KOH), expressed as % of the negative control, should be < 15 %
3.
In the range 20-100 % viability, and for ODs ≥ 0.3, difference of viability between the two tissue replicates should not exceed 30 %
1.
Mean OD of the tissue replicates treated with the negative control (H2O) should be ≥ 0.8 and ≤ 2.8 for every exposure time
2.
Mean viability of the tissue replicates exposed for 1 hour with the positive control (8N KOH), expressed as % of the negative control, should be < 20 %
3.
In the range 20-100 % viability, and for ODs ≥ 0.3, difference of viability between the two tissue replicates should not exceed 30 %
Appendix 3
PERFORMANCE OF TEST MODELS FOR SUB-CATEGORISATION
The table below provides the performances of the four test models calculated based on a set of 80 chemicals tested by the four test developers. Calculations were performed by the OECD Secretariat, reviewed and agreed by an expert subgroup (21) (23).
EpiSkin™, EpiDerm™, SkinEthic™ and epiCS® test models are able to sub-categorise (i.e. 1A versus 1B-and-1C versus NC).
Performances, overclassification rates, underclassification rates, and accuracy (Predictive capacity) of the four test models based on a set of 80 chemicals all tested over 2 or 3 runs in each test model:
STATISTICS ON PREDICTIONS OBTAINED ON THE ENTIRE SET OF CHEMICALS
(n= 80 chemicals tested over 2 independent runs for epiCS® or 3 independent runs for EpiDerm™ SCT, EpiSkin™ and SkinEthic™ RHE, i.e. respectively 159 (*2) or 240 classifications)
Statistic | EpiSkin™ | EpiDerm™ | SkinEthic™ | epiCS®
Overclassifications:
1B-and-1C overclassified 1A | 21,50 % | 29,0 % | 31,2 % | 32,8 %
NC overclassified 1B-and-1C | 20,7 % | 23,4 % | 27,0 % | 28,4 %
NC overclassified 1A | 0,00 % | 2,7 % | 0,0 % | 0,00 %
Overclassified Corr. | 20,7 % | 26,1 % | 27,0 % | 28,4 %
Global overclassification rate (all categories) | 17,9 % | 23,3 % | 24,5 % | 25,8 %
Underclassifications:
1A underclassified 1B-and-1C | 16,7 % | 16,7 % | 16,7 % | 12,5 %
1A underclassified NC | 0,00 % | 0,00 % | 0,00 % | 0,00 %
1B-and-1C underclassified NC | 2,2 % | 0,00 % | 7,5 % | 6,6 %
Global underclassification rate (all categories) | 3,3 % | 2,5 % | 5,4 % | 4,4 %
Correct Classifications:
1A correctly classified | 83,3 % | 83,3 % | 83,3 % | 87,5 %
1B-and-1C correctly classified | 76,3 % | 71,0 % | 61,3 % | 60,7 %
NC correctly classified | 79,3 % | 73,9 % | 73,0 % | 71,62 %
Overall Accuracy | 78,8 % | 74,2 % | 70 % | 69,8 %
NC: Non-corrosive
Appendix 4
Key parameters and acceptance criteria for qualification of an HPLC/UPLC-spectrophotometry system for measurement of MTT formazan extracted from RhE tissue
Parameter
Protocol Derived from FDA Guidance (37)(38)
Acceptance Criteria
Selectivity
Analysis of isopropanol, living blank (isopropanol extract from living RhE tissues without any treatment), dead blank (isopropanol extract from killed RhE tissues without any treatment)
Quality Controls (i.e., MTT formazan at 1,6 μg/ml, 16 μg/ml and 160 μg/ml) in isopropanol (n=5)
CV ≤ 15 % or ≤ 20 % for the LLOQ
Accuracy
Quality Controls in isopropanol (n=5)
%Dev ≤ 15 % or ≤ 20 % for LLOQ
Matrix Effect
Quality Controls in living blank (n=5)
85 % ≤ Matrix Effect % ≤ 115 %
Carryover
Analysis of isopropanol after an ULOQ (14) standard
Area(interference) ≤ 20 % of Area(LLOQ)
Reproducibility (intra-day)
3 independent calibration curves (based on 6 consecutive 1/3 dilutions of MTT formazan in isopropanol starting at ULOQ, i.e., 200 μg/ml);
Quality Controls in isopropanol (n=5)
Calibration Curves: %Dev ≤ 15 % or ≤ 20 % for LLOQ
Quality Controls: %Dev ≤ 15 % and CV ≤ 15 %
Reproducibility (inter-day)
Day 1: 1 calibration curve and Quality Controls in isopropanol (n=3)
Day 2: 1 calibration curve and Quality Controls in isopropanol (n=3)
Day 3: 1 calibration curve and Quality Controls in isopropanol (n=3)
Short Term Stability of MTT Formazan in RhE Tissue Extract
Quality Controls in living blank (n=3) analysed the day of the preparation and after 24 hours of storage at room temperature
%Dev ≤ 15 %
Long Term Stability of MTT Formazan in RhE Tissue Extract, if required
Quality Controls in living blank (n=3) analysed the day of the preparation and after several days of storage at a specified temperature (e.g., 4 °C, –20 °C, –80 °C)
%Dev ≤ 15 %
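As an illustration of how the CV and %Dev acceptance criteria listed above could be evaluated for one Quality Control level, a minimal sketch follows. It assumes that %Dev is the relative deviation of the mean measured concentration from the nominal concentration, and that the 20 % limit applies only to the level designated as LLOQ. Both the function names and the example values are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation of replicate measurements, in percent."""
    return 100.0 * stdev(values) / mean(values)


def deviation_percent(values, nominal):
    """Assumed definition of %Dev: relative deviation of the mean from nominal."""
    return 100.0 * abs(mean(values) - nominal) / nominal


def qc_level_passes(values, nominal, is_lloq=False):
    """Apply the 15 % limit, or 20 % for the LLOQ, to both CV and %Dev."""
    limit = 20.0 if is_lloq else 15.0
    return cv_percent(values) <= limit and deviation_percent(values, nominal) <= limit


# Hypothetical replicate concentrations (ug/ml) for a 16 ug/ml Quality Control.
qc_ok = qc_level_passes([15.8, 16.1, 16.0, 16.3, 15.9], nominal=16.0)
```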
"
(7)
In Part B, Chapter B.46 is replaced by the following:
"B.46 IN VITRO SKIN IRRITATION: RECONSTRUCTED HUMAN EPIDERMIS TEST METHOD
INTRODUCTION
1.
This test method (TM) is equivalent to OECD test guideline (TG) 439 (2015). Skin irritation refers to the production of reversible damage to the skin following the application of a test chemical for up to 4 hours [as defined by the United Nations (UN) Globally Harmonized System of Classification and Labelling of Chemicals (GHS)](1) and the European Union (EU) Regulation 1272/2008 on Classification, Labelling and Packaging of Substances and Mixtures (CLP) (15). This test method provides an in vitro procedure that may be used for the hazard identification of irritant chemicals (substances and mixtures) in accordance with UN GHS/CLP Category 2 (2). In regions that do not adopt the optional UN GHS Category 3 (mild irritants), this test method can also be used to identify non-classified chemicals. Therefore, depending on the regulatory framework and the classification system in use, this test method may be used to determine the skin irritancy of chemicals either as a stand-alone replacement test for in vivo skin irritation testing or as a partial replacement test within a testing strategy (3).
2.
The assessment of skin irritation has typically involved the use of laboratory animals [TM B.4, equivalent to OECD TG 404 originally adopted in 1981 and revised in 1992, 2002 and 2015] (4). For the testing of corrosivity, three validated in vitro test methods have been adopted as EU TM B.40 (equivalent to OECD TG 430), TM B.40bis (equivalent to OECD TG 431) and TM B.65 (equivalent to OECD TG 435) (5) (6) (7). An OECD guidance document on Integrated Approaches to Testing and Assessment (IATA) for Skin Corrosion and Irritation describes several modules which group information sources and analysis tools. It (i) provides guidance on how to integrate and use existing test and non-test data for the assessment of skin irritation and skin corrosion potentials of chemicals and (ii) proposes an approach when further testing is needed (3).
3.
This test method addresses the human health endpoint skin irritation. It is based on the in vitro test system of reconstructed human epidermis (RhE), which closely mimics the biochemical and physiological properties of the upper parts of the human skin, i.e. the epidermis. The RhE test system uses human derived non-transformed keratinocytes as cell source to reconstruct an epidermal model with representative histology and cytoarchitecture. Performance Standards (PS) are available to facilitate the validation and assessment of similar and modified RhE-based test methods, in accordance with the principles of the OECD guidance document No 34 (8) (9). The corresponding test guideline was originally adopted in 2010, updated in 2013 to include additional RhE models, and updated in 2015 to refer to the IATA guidance document and introduce the use of an alternative procedure to measure viability.
4.
Pre-validation, optimisation and validation studies have been completed for four commercially available in vitro test models (10) (11) (12) (13) (14) (15) (16) (17) (18) (19) (20) (21) (22) (23) (24) (25) (26) (27) (28) based on the RhE test system (sensitivity 80 %, specificity 70 %, and accuracy 75 %). These four test models are included in this TM and are listed in Appendix 2, which also provides information on the type of validation study used to validate the respective test methods. As noted in Appendix 2, the Validated Reference Method (VRM) has been used to develop the present test method and the Performance Standards (8).
5.
OECD Mutual Acceptance of Data will only be guaranteed for test models validated according to the Performance Standards (8), if these test models have been reviewed and adopted by OECD. The test models included in this test method and the corresponding OECD TG can be used indiscriminately to address countries’ requirements for test results from in vitro test methods for skin irritation, while benefiting from the Mutual Acceptance of Data.
6.
Definitions of terms used in this document are provided in Appendix 1.
INITIAL CONSIDERATIONS AND LIMITATIONS
7.
A limitation of the test method, as demonstrated by the full prospective validation study assessing and characterising RhE test methods (16), is that it does not allow the classification of chemicals to the optional UN GHS Category 3 (mild irritants) (1). Thus, the regulatory framework in member countries will decide how this test method will be used. For the EU, Category 3 has not been taken up in CLP. For a full evaluation of local skin effects after a single dermal exposure, the OECD Guidance Document on Integrated Approaches for Testing and Assessment should be consulted (3). It is recognised that the use of human skin is subject to national and international ethical considerations and conditions.
8.
This test method addresses the human health endpoint skin irritation. While this test method does not provide adequate information on skin corrosion, it should be noted that TM B.40bis (equivalent to OECD TG 431) on skin corrosion is based on the same RhE test system, though using another protocol (6). This test method is based on RhE-models using human keratinocytes, which therefore represent in vitro the target organ of the species of interest. It moreover directly covers the initial step of the inflammatory cascade/mechanism of action (cell and tissue damage resulting in localised trauma) that occurs during irritation in vivo. A wide range of chemicals has been tested in the validation underlying this test method and the database of the validation study amounted to 58 chemicals in total (16) (18) (23). The test method is applicable to solids, liquids, semi-solids and waxes. The liquids may be aqueous or non-aqueous; solids may be soluble or insoluble in water. Whenever possible, solids should be ground to a fine powder before application; no other pre-treatment of the sample is required. Gases and aerosols have not been assessed yet in a validation study (29). While it is conceivable that these can be tested using RhE technology, the current test method does not allow testing of gases and aerosols.
9.
Before use of the test method on a mixture for generating data for an intended regulatory purpose, it should be considered whether, and if so why, it may provide adequate results for that purpose. Such considerations are not needed, when there is a regulatory requirement for testing of the mixture. However, due to the fact that mixtures cover a wide spectrum of categories and composition, and that only limited information is currently available on the testing of mixtures, in cases where evidence can be demonstrated on the non-applicability of the test method to a specific category of mixtures (e.g. following a strategy as proposed in Eskes et al. 2012 (30)), the test method should not be used for that specific category of mixtures. Similar care should be taken in case specific chemical classes or physico-chemical properties are found not to be applicable to the current test method.
10.
Test chemicals absorbing light in the same range as MTT formazan and test chemicals able to directly reduce the vital dye MTT (to MTT formazan) may interfere with the cell viability measurements and need the use of adapted controls for corrections (see paragraphs 28-34).
11.
A single testing run composed of three replicate tissues should be sufficient for a test chemical when the classification is unequivocal. However, in cases of borderline results, such as non-concordant replicate measurements and/or mean percent viability equal to 50 ± 5 %, a second run should be considered, as well as a third one in case of discordant results between the first two runs.
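The borderline rule in the preceding paragraph can be sketched as a simple check. This is an illustrative reading only: it assumes that 'non-concordant replicate measurements' means replicates falling on both sides of the 50 % viability threshold, and the function and parameter names are hypothetical.

```python
def second_run_recommended(replicate_viabilities, threshold=50.0, margin=5.0):
    """Return True when a further run should be considered for a test chemical.

    replicate_viabilities: percent viability of each replicate tissue in one run.
    """
    mean_viability = sum(replicate_viabilities) / len(replicate_viabilities)
    # Borderline mean: within 50 +/- 5 % viability.
    borderline_mean = abs(mean_viability - threshold) <= margin
    # Assumed meaning of non-concordance: replicates give different calls.
    calls = {viability <= threshold for viability in replicate_viabilities}
    non_concordant = len(calls) > 1
    return borderline_mean or non_concordant
```

For example, replicates at 12 %, 15 % and 10 % viability give an unequivocal result, whereas replicates at 48 %, 53 % and 51 % would prompt consideration of a second run.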
PRINCIPLE OF THE TEST
12.
The test chemical is applied topically to a three-dimensional RhE model, comprised of non-transformed human-derived epidermal keratinocytes, which have been cultured to form a multilayered, highly differentiated model of the human epidermis. It consists of organised basal, spinous and granular layers, and a multilayered stratum corneum containing intercellular lamellar lipid layers representing main lipid classes analogous to those found in vivo.
13.
Chemical-induced skin irritation, manifested mainly by erythema and oedema, is the result of a cascade of events beginning with penetration of the chemicals through the stratum corneum where they may damage the underlying layers of keratinocytes and other skin cells. The damaged cells may either release inflammatory mediators or induce an inflammatory cascade which also acts on the cells in the dermis, particularly the stromal and endothelial cells of the blood vessels. It is the dilation and increased permeability of the endothelial cells that produce the observed erythema and oedema (29). Notably, the RhE-based test methods, in the absence of any vascularisation in the in vitro test system, measure the initiating events in the cascade, e.g. cell / tissue damage (16) (17), using cell viability as readout.
14.
Cell viability in RhE models is measured by enzymatic conversion of the vital dye MTT [3-(4,5-Dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide, Thiazolyl blue; CAS number 298-93-1], into a blue formazan salt that is quantitatively measured after extraction from tissues (31). Irritant chemicals are identified by their ability to decrease cell viability below defined threshold levels (i.e. ≤ 50 %, for UN GHS/CLP Category 2). Depending on the regulatory framework and applicability of the test method, test chemicals that produce cell viabilities above the defined threshold level, may be considered non-irritants (i.e. > 50 %, No Category).
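The threshold described in the preceding paragraph can be expressed as a short sketch. The function name is hypothetical, and whether a result above the threshold may be reported as 'No Category' depends on the regulatory framework and applicability of the test method, as stated above.

```python
def classify_skin_irritation(mean_viability_percent: float) -> str:
    """Apply the 50 % viability threshold to the mean percent tissue viability."""
    if mean_viability_percent <= 50.0:
        return "UN GHS/CLP Category 2"
    # Above the threshold: may be considered non-irritant, subject to the
    # regulatory framework in use.
    return "No Category"
```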
DEMONSTRATION OF PROFICIENCY
15.
Prior to routine use of any of the four validated test models that adhere to this test method (Appendix 2), laboratories should demonstrate technical proficiency, using the ten Proficiency Substances listed in Table 1. In situations where, for instance, a listed substance is unavailable, another substance for which adequate in vivo and in vitro reference data are available may be used (e.g. from the list of reference chemicals (8)) provided that the same selection criteria as described in Table 1 are applied. Using an alternative proficiency substance should be justified.
16.
As part of the proficiency testing, it is recommended that users verify the barrier properties of the tissues after receipt as specified by the RhE model producer. This is particularly important if tissues are shipped over long distance/time periods. Once a test method has been successfully established and proficiency in its use has been acquired and demonstrated, such verification will not be necessary on a routine basis. However, when using a test method routinely, it is recommended to continue to assess the barrier properties at regular intervals.
17.
The following is a description of the components and procedures of a RhE test method for skin irritation assessment (see also Appendix 3 for parameters related to each test model). Standard Operating Procedures (SOPs) for the four models complying with this test method are available (32) (33) (34) (35).
RHE TEST METHOD COMPONENTS
General conditions
18.
Non-transformed human keratinocytes should be used to reconstruct the epithelium. Multiple layers of viable epithelial cells (basal layer, stratum spinosum, stratum granulosum) should be present under a functional stratum corneum. Stratum corneum should be multilayered containing the essential lipid profile to produce a functional barrier with robustness to resist rapid penetration of cytotoxic benchmark chemicals, e.g. sodium dodecyl sulphate (SDS) or Triton X-100. The barrier function should be demonstrated and may be assessed either by determination of the concentration at which a benchmark chemical reduces the viability of the tissues by 50 % (IC50) after a fixed exposure time, or by determination of the exposure time required to reduce cell viability by 50 % (ET50) upon application of the benchmark chemical at a specified, fixed concentration. The containment properties of the RhE model should prevent the passage of material around the stratum corneum to the viable tissue, which would lead to poor modelling of skin exposure. The RhE model should be free of contamination by bacteria, viruses, mycoplasma, or fungi.
Functional conditions
Viability
19.
The assay used for quantifying viability is the MTT-assay (31). The viable cells of the RhE tissue construct can reduce the vital dye MTT into a blue MTT formazan precipitate which is then extracted from the tissue using isopropanol (or a similar solvent). The optical density (OD) of the extraction solvent alone should be sufficiently small, i.e. OD < 0.1. The extracted MTT formazan may be quantified using either a standard absorbance (OD) measurement or an HPLC/UPLC-spectrophotometry procedure (36). The RhE model users should ensure that each batch of the RhE model used meets defined criteria for the negative control. An acceptability range (upper and lower limit) for the negative control OD values (in the Skin Irritation test method conditions) is established by the RhE model developer/supplier. Acceptability ranges for the four validated RhE models included in this test method are given in Table 2. An HPLC/UPLC-Spectrophotometry user should use the negative control OD ranges provided in Table 2 as the acceptance criterion for the negative control. It should be documented that the tissues treated with the negative control are stable in culture (provide similar viability measurements) for the duration of the test exposure period.
Table 2
Acceptability ranges for negative control OD values of the test models included in this TM
Test model | Lower acceptance limit | Upper acceptance limit
EpiSkin™ (SM) | ≥ 0,6 | ≤ 1,5
EpiDerm™ SIT (EPI-200) | ≥ 0,8 | ≤ 2,8
SkinEthic™ RHE | ≥ 0,8 | ≤ 3,0
LabCyte EPI-MODEL24 SIT | ≥ 0,7 | ≤ 2,5
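The acceptability ranges of Table 2 lend themselves to a simple lookup. The sketch below is illustrative only; the dictionary keys and the function name are hypothetical, and the limits are transcribed from Table 2 using decimal points.

```python
# Negative control OD acceptance limits (lower, upper), as listed in Table 2.
NC_OD_RANGES = {
    "EpiSkin (SM)": (0.6, 1.5),
    "EpiDerm SIT (EPI-200)": (0.8, 2.8),
    "SkinEthic RHE": (0.8, 3.0),
    "LabCyte EPI-MODEL24 SIT": (0.7, 2.5),
}


def negative_control_acceptable(model: str, mean_od: float) -> bool:
    """Check the mean negative control OD against the range for the given model."""
    lower, upper = NC_OD_RANGES[model]
    return lower <= mean_od <= upper
```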
Barrier function
20.
The stratum corneum and its lipid composition should be sufficient to resist the rapid penetration of cytotoxic benchmark chemicals, e.g. SDS or Triton X-100, as estimated by IC50 or ET50 (Table 3).
Morphology
21.
Histological examination of the RhE model should be provided demonstrating human epidermis-like structure (including multilayered stratum corneum).
Reproducibility
22.
The results of the positive and negative controls of the test method should demonstrate reproducibility over time.
Quality control (QC)
23.
The RhE model should only be used if the developer/supplier demonstrates that each batch of the RhE model used meets defined production release criteria, among which those for viability (paragraph 19), barrier function (paragraph 20) and morphology (paragraph 21) are the most relevant. These data should be provided to the test method users, so that they are able to include this information in the test report. An acceptability range (upper and lower limit) for the IC50 or the ET50 should be established by the RhE model developer/supplier. Only results produced with qualified tissues can be accepted for reliable prediction of irritation classification. The acceptability ranges for the four test models included in this TM are given in Table 3.
Table 3
QC batch release criteria of the test models included in this TM
Test model (benchmark treatment) | Lower acceptance limit | Upper acceptance limit
EpiSkin™ (SM) (18 hours treatment with SDS) (32) | IC50 = 1,0 mg/ml | IC50 = 3,0 mg/ml
EpiDerm™ SIT (EPI-200) (1 % Triton X-100) (33) | ET50 = 4,0 hr | ET50 = 8,7 hr
SkinEthic™ RHE (1 % Triton X-100) (34) | ET50 = 4,0 hr | ET50 = 10,0 hr
LabCyte EPI-MODEL24 SIT (18 hours treatment with SDS) (35) | IC50 = 1,4 mg/ml | IC50 = 4,0 mg/ml
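A batch release check against the limits of Table 3 might be sketched as follows. The names are hypothetical, and treating both acceptance limits as inclusive is an assumption made for the purpose of the illustration.

```python
# (measure and unit, lower limit, upper limit), transcribed from Table 3.
QC_RELEASE_CRITERIA = {
    "EpiSkin (SM)": ("IC50, mg/ml SDS", 1.0, 3.0),
    "EpiDerm SIT (EPI-200)": ("ET50, hours with 1 % Triton X-100", 4.0, 8.7),
    "SkinEthic RHE": ("ET50, hours with 1 % Triton X-100", 4.0, 10.0),
    "LabCyte EPI-MODEL24 SIT": ("IC50, mg/ml SDS", 1.4, 4.0),
}


def batch_meets_barrier_criterion(model: str, measured_value: float) -> bool:
    """Compare a batch's IC50 or ET50 with the acceptance range for its model.

    Assumption: the limits themselves are treated as acceptable values.
    """
    _measure, lower, upper = QC_RELEASE_CRITERIA[model]
    return lower <= measured_value <= upper
```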
Application of the Test Chemical and Control Chemicals
24.
At least three replicates should be used for each test chemical and for the controls in each run. For liquid as well as solid chemicals, a sufficient amount of test chemical should be applied to uniformly cover the epidermis surface while avoiding an infinite dose, i.e. ranging from 26 to 83 μl/cm2 or mg/cm2 (see Appendix 3). For solid chemicals, the epidermis surface should be moistened with deionised or distilled water before application, to improve contact between the test chemical and the epidermis surface. Whenever possible, solids should be tested as a fine powder. A nylon mesh may be used as a spreading aid in some cases (see Appendix 3). At the end of the exposure period, the test chemical should be carefully washed from the epidermis surface with aqueous buffer, or 0,9 % NaCl. Depending on the RhE test models used, the exposure period ranges between 15 and 60 minutes, and the incubation temperature between 20 and 37 °C. These exposure periods and temperatures are optimised for each individual RhE test method and represent the different intrinsic properties of the test models (e.g. barrier function) (see Appendix 3).
25.
A concurrent negative control (NC) and positive control (PC) should be used in each run to demonstrate that viability (using the NC), barrier function and resulting tissue sensitivity (using the PC) of the tissues are within a defined historical acceptance range. The suggested PC is 5 % aqueous SDS. The suggested NC is either water or phosphate buffered saline (PBS).
Cell Viability Measurements
26.
According to the test procedure, it is essential that the viability measurement is not performed immediately after exposure to the test chemical, but after a sufficiently long post-treatment incubation period of the rinsed tissue in fresh medium. This period allows both for recovery from weak cytotoxic effects and for appearance of clear cytotoxic effects. A 42-hour post-treatment incubation period was found optimal during test optimisation of two of the RhE-based test models underlying this test method (11) (12) (13) (14) (15).
27.
The MTT assay is a standardised quantitative method which should be used to measure cell viability under this test method. It is compatible with use in a three-dimensional tissue construct. The tissue sample is placed in MTT solution of appropriate concentration (e.g. 0,3 - 1 mg/ml) for 3 hours. The MTT is converted into blue formazan by the viable cells. The precipitated blue formazan product is then extracted from the tissue using a solvent (e.g. isopropanol, acidic isopropanol), and the concentration of formazan is measured by determining the OD at 570 nm using a filter band pass of maximum ± 30 nm or by using an HPLC/UPLC-spectrophotometry procedure (see paragraph 34) (36).
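By way of illustration, percent tissue viability may be derived from the OD readings as the mean OD of the treated replicates relative to the mean OD of the concurrent negative control. The sketch below assumes that any blank correction has already been applied to the OD values; the function name and the example readings are hypothetical.

```python
from statistics import mean


def percent_viability(test_ods, negative_control_ods):
    """Mean OD of treated tissues as a percentage of the mean negative control OD.

    Assumption: the OD values passed in are already blank-corrected.
    """
    return 100.0 * mean(test_ods) / mean(negative_control_ods)


# Hypothetical readings for three replicate tissues each.
viability = percent_viability([0.41, 0.45, 0.43], [1.20, 1.25, 1.18])
```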
28.
Optical properties of the test chemical or its chemical action on MTT (e.g. chemicals may prevent or reverse the colour generation as well as cause it) may interfere with the assay leading to a false estimate of viability. This may occur when a specific test chemical is not completely removed from the tissue by rinsing or when it penetrates the epidermis. If a test chemical acts directly on the MTT (e.g. MTT-reducer), is naturally coloured, or becomes coloured during tissue treatment, additional controls should be used to detect and correct for test chemical interference with the viability measurement technique (see paragraphs 29 and 33). Detailed description of how to correct direct MTT reduction and interferences by colouring agents is available in the SOPs for the four validated models included in this test method (32) (33) (34) (35).
29.
To identify direct MTT reducers, each test chemical should be added to freshly prepared MTT solution. If the MTT mixture containing the test chemical turns blue/purple, the test chemical is presumed to directly reduce MTT and a further functional check on non-viable RhE tissues should be performed, independently of using the standard absorbance (OD) measurement or an HPLC/UPLC-spectrophotometry procedure. This additional functional check employs killed tissues that possess only residual metabolic activity but absorb the test chemical in a similar way as viable tissues. Each MTT reducing test chemical is applied on at least two killed tissue replicates which undergo the entire testing procedure to generate a non-specific MTT reduction (NSMTT) (32) (33) (34) (35). A single NSMTT control is sufficient per test chemical regardless of the number of independent tests/runs performed. The true tissue viability is then calculated as the percent tissue viability obtained with living tissues exposed to the MTT reducer minus the percent non-specific MTT reduction obtained with the killed tissues exposed to the same MTT reducer, calculated relative to the negative control run concurrently to the test being corrected (%NSMTT).
30.
To identify potential interference by coloured test chemicals or test chemicals that become coloured when in contact with water or isopropanol and decide on the need for additional controls, spectral analysis of the test chemical in water (environment during exposure) and/or isopropanol (extracting solution) should be performed. If the test chemical in water and/or isopropanol absorbs light in the range of 570 ± 30 nm, further colorant controls should be performed or, alternatively, an HPLC/UPLC-spectrophotometry procedure should be used in which case these controls are not required (see paragraphs 33 and 34). When performing the standard absorbance (OD) measurement, each interfering coloured test chemical is applied on at least two viable tissue replicates, which undergo the entire testing procedure but are incubated with medium instead of MTT solution during the MTT incubation step to generate a non-specific colour (NSCliving) control. The NSCliving control needs to be performed concurrently to the testing of the coloured test chemical and in case of multiple testing, an independent NSCliving control needs to be conducted with each test performed (in each run) due to the inherent biological variability of living tissues. The true tissue viability is then calculated as the percent tissue viability obtained with living tissues exposed to the interfering test chemical and incubated with MTT solution minus the percent non-specific colour obtained with living tissues exposed to the interfering test chemical and incubated with medium without MTT, run concurrently to the test being corrected (%NSCliving).
31.
Test chemicals that are identified as producing both direct MTT reduction (see paragraph 29) and colour interference (see paragraph 30) will also require a third set of controls, apart from the NSMTT and NSCliving controls described in the previous paragraphs, when performing the standard absorbance (OD) measurement. This is usually the case with darkly coloured test chemicals interfering with the MTT assay (e.g. blue, purple, black) because their intrinsic colour impedes the assessment of their capacity to directly reduce MTT as described in paragraph 29. These test chemicals may bind to both living and killed tissues and therefore the NSMTT control may not only correct for potential direct MTT reduction by the test chemical, but also for colour interference arising from the binding of the test chemical to killed tissues. This could lead to a double correction for colour interference since the NSCliving control already corrects for colour interference arising from the binding of the test chemical to living tissues. To avoid a possible double correction for colour interference, a third control for non-specific colour in killed tissues (NSCkilled) needs to be performed. In this additional control, the test chemical is applied on at least two killed tissue replicates, which undergo the entire testing procedure but are incubated with medium instead of MTT solution during the MTT incubation step. A single NSCkilled control is sufficient per test chemical regardless of the number of independent tests/runs performed, but should be performed concurrently to the NSMTT control and, where possible, with the same tissue batch.
The true tissue viability is then calculated as the percent tissue viability obtained with living tissues exposed to the test chemical minus %NSMTT minus %NSCliving plus the percent non-specific colour obtained with killed tissues exposed to the interfering test chemical and incubated with medium without MTT, calculated relative to the negative control run concurrently to the test being corrected (%NSCkilled).
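The corrections described in paragraphs 29 to 31 can be summarised in a single sketch. All inputs are percentages relative to the concurrent negative control; terms that do not apply to a given test chemical are left at zero. The function and parameter names are hypothetical.

```python
def true_tissue_viability(viability_living, nsmtt=0.0, nsc_living=0.0, nsc_killed=0.0):
    """Corrected percent viability for the standard absorbance (OD) measurement.

    viability_living: percent viability of living tissues exposed to the test chemical
    nsmtt:            %NSMTT (direct MTT reducers, paragraph 29)
    nsc_living:       %NSCliving (colour interference, paragraph 30)
    nsc_killed:       %NSCkilled (added back to avoid double correction, paragraph 31)
    """
    return viability_living - nsmtt - nsc_living + nsc_killed


# Hypothetical test chemical that both reduces MTT and interferes by colour.
corrected = true_tissue_viability(70.0, nsmtt=15.0, nsc_living=10.0, nsc_killed=5.0)
```

With these hypothetical inputs the corrected viability is 50 %, which illustrates how a correction can move a result onto the classification threshold.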
32.
It is important to note that non-specific MTT reduction and non-specific colour interferences may increase the readouts of the tissue extract above the linearity range of the spectrophotometer. On this basis, each laboratory should determine the linearity range of their spectrophotometer with MTT formazan (CAS # 57360-69-7) from a commercial source before initiating the testing of test chemicals for regulatory purposes. The standard absorbance (OD) measurement using a spectrophotometer is appropriate to assess direct MTT-reducers and colour interfering test chemicals when the ODs of the tissue extracts obtained with the test chemical without any correction for direct MTT reduction and/or colour interference are within the linear range of the spectrophotometer or when the uncorrected percent viability obtained with the test chemical is already ≤ 50 %. Nevertheless, results for test chemicals producing %NSMTT and/or %NSCliving ≥ 50 % of the negative control should be taken with caution as this is the cut-off used to distinguish classified from not classified chemicals (see paragraph 36).
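The conditions in the preceding paragraph can be sketched as two checks. This is an illustrative reading only: it assumes the linearity range is represented by its upper OD limit, and all names are hypothetical.

```python
def od_measurement_appropriate(uncorrected_od, linear_range_upper_od,
                               uncorrected_viability_percent):
    """Standard OD measurement is appropriate if the uncorrected OD is within the
    linear range, or if the uncorrected viability is already <= 50 %."""
    within_linear_range = uncorrected_od <= linear_range_upper_od
    already_at_or_below_cutoff = uncorrected_viability_percent <= 50.0
    return within_linear_range or already_at_or_below_cutoff


def interference_needs_caution(nsmtt_percent=0.0, nsc_living_percent=0.0):
    """Flag results where %NSMTT and/or %NSCliving reach 50 % of the negative control."""
    return nsmtt_percent >= 50.0 or nsc_living_percent >= 50.0
```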
33.
For coloured test chemicals which are not compatible with the standard absorbance (OD) measurement due to too strong interference with the MTT assay, the alternative HPLC/UPLC-spectrophotometry procedure to measure MTT formazan may be employed (see paragraph 34) (36). The HPLC/UPLC-spectrophotometry system allows for the separation of the MTT formazan from the test chemical before its quantification (36). For this reason, NSCliving or NSCkilled controls are never required when using HPLC/UPLC-spectrophotometry, independently of the chemical being tested. NSMTT controls should nevertheless be used if the test chemical is suspected to directly reduce MTT or has a colour that impedes the assessment of the capacity to directly reduce MTT (as described in paragraph 29). When using HPLC/UPLC-spectrophotometry to measure MTT formazan, the percent tissue viability is calculated as percent MTT formazan peak area obtained with living tissues exposed to the test chemical relative to the MTT formazan peak obtained with the concurrent negative control. For test chemicals able to directly reduce MTT, true tissue viability is calculated as the percent tissue viability obtained with living tissues exposed to the test chemical minus %NSMTT. Finally, it should be noted that direct MTT-reducers that may also be colour interfering, which are retained in the tissues after treatment and reduce MTT so strongly that they lead to ODs (using standard OD measurement) or peak areas (using UPLC/HPLC-spectrophotometry) of the tested tissue extracts that fall outside of the linearity range of the spectrophotometer cannot be assessed, although these are expected to occur in only very rare situations.
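As a minimal sketch of the HPLC/UPLC-spectrophotometry calculation described in this paragraph, the percent tissue viability can be derived from peak areas and, for direct MTT-reducers, corrected by %NSMTT only. The function names and the example peak areas are hypothetical; %NSMTT is assumed to have been determined separately and is passed in as a percentage of the negative control.

```python
def hplc_viability(peak_area_test, peak_area_negative_control):
    """Percent viability from MTT formazan peak areas (test vs. concurrent negative control)."""
    return 100.0 * peak_area_test / peak_area_negative_control


def hplc_true_viability(peak_area_test, peak_area_negative_control, pct_nsmtt=0.0):
    """Viability corrected for direct MTT reduction; no NSC controls are needed with HPLC/UPLC."""
    return hplc_viability(peak_area_test, peak_area_negative_control) - pct_nsmtt


# Hypothetical peak areas (arbitrary units) and a hypothetical %NSMTT of 12 %.
print(hplc_viability(1200.0, 2000.0))             # 60.0
print(hplc_true_viability(1200.0, 2000.0, 12.0))  # 48.0
```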
34.
HPLC/UPLC-spectrophotometry may also be used with all types of test chemicals (coloured, non-coloured, MTT-reducers and non-MTT reducers) for measurement of MTT formazan (36). Due to the diversity of HPLC/UPLC-spectrophotometry systems, qualification of the HPLC/UPLC-spectrophotometry system should be demonstrated before its use to quantify MTT formazan from tissue extracts by meeting the acceptance criteria for a set of standard qualification parameters based on those described in the U.S. Food and Drug Administration guidance for industry on bio-analytical method validation (36) (37). These key parameters and their acceptance criteria are shown in Appendix 4. Once the acceptance criteria defined in Appendix 4 have been met, the HPLC/UPLC-spectrophotometry system is considered qualified and ready to measure MTT formazan under the experimental conditions described in this test method.
Acceptability Criteria
35.
For each test method using valid RhE model batches (see paragraph 23), tissues treated with the negative control should exhibit OD reflecting the quality of the tissues that followed shipment, receipt steps and all protocol processes. Control OD values should not be below historically established boundaries. Similarly, tissues treated with the PC, i.e. 5 % aqueous SDS, should reflect their ability to respond to an irritant chemical under the conditions of the test method (see Appendix 3 and for further information SOPs of the four test models included in this TG (32) (33) (34) (35)). Associated and appropriate measures of variability between tissue replicates, i.e. standard deviations (SD) should fall within the acceptance limits established for the test model used (see Appendix 3).
Interpretation of Results and Prediction Model
36.
The OD values obtained with each test chemical can be used to calculate the percentage of viability normalised to the negative control, which is set to 100 %. In case HPLC/UPLC-spectrophotometry is used, the percent tissue viability is calculated as percent MTT formazan peak area obtained with living tissues exposed to the test chemical relative to the MTT formazan peak obtained with the concurrent negative control. The cut-off value of percentage cell viability distinguishing irritant from non-classified test chemicals and the statistical procedure(s) used to evaluate the results and identify irritant chemicals should be clearly defined, documented, and proven to be appropriate (see SOPs of the test models for information). The cut-off values for the prediction of irritation are given below:
—
The test chemical is identified as requiring classification and labelling according to UN GHS/CLP (Category 2 or Category 1) if the mean percent tissue viability after exposure and post-treatment incubation is less than or equal (≤) to 50 %. Since the RhE test models covered by this test method cannot resolve between UN GHS/CLP Categories 1 and 2, further information on skin corrosion will be required to decide on its final classification [see also the OECD Guidance Document on IATA (3)]. In case the test chemical is found to be non-corrosive (e.g. based on TM B.40, B.40bis or B.65), and shows a tissue viability after exposure and post-treatment incubation of less than or equal (≤) to 50 %, the test chemical is considered to be irritant to skin in accordance with UN GHS/CLP Category 2.
—
Depending on the regulatory framework in member countries, the test chemical may be considered as non-irritant to skin in accordance with UN GHS/CLP No Category if the tissue viability after exposure and post-treatment incubation is more than (>) 50 %.
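The prediction model above can be sketched as a simple decision function. This is an illustration under stated assumptions only: the function name `predict_irritation`, the `non_corrosive` flag and the returned strings are hypothetical, and the "No Category" outcome remains subject to the applicable regulatory framework.

```python
CUT_OFF_PCT = 50.0  # cut-off for mean percent tissue viability given in the prediction model


def predict_irritation(mean_viability_pct, non_corrosive=False):
    """Return the predicted outcome for a mean percent tissue viability.

    non_corrosive -- True only if separate information (e.g. a skin corrosion
                     test) has shown the test chemical to be non-corrosive.
    """
    if mean_viability_pct <= CUT_OFF_PCT:
        if non_corrosive:
            return "Category 2"
        # The RhE models cannot resolve between Categories 1 and 2.
        return "Category 2 or Category 1 (further information on skin corrosion required)"
    return "No Category"


print(predict_irritation(50.0, non_corrosive=True))  # Category 2
print(predict_irritation(50.1))                      # No Category
```

Note that a value of exactly 50 % falls on the classified side of the cut-off.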
DATA AND REPORTING
Data
37.
For each run, data from individual replicate tissues (e.g. OD values and calculated percentage cell viability data for each test chemical, including classification) should be reported in tabular form, including data from repeat experiments as appropriate. In addition means ± SD for each run should be reported. Observed interactions with MTT reagent and coloured test chemicals should be reported for each tested chemical.
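A minimal sketch of how the reported values could be derived from raw OD readings is given below. The OD values are hypothetical, the use of the sample standard deviation (`statistics.stdev`) is an assumption, and the SD limit of 18 is the value listed for the test models in Appendix 3.

```python
from statistics import mean, stdev

SD_LIMIT = 18.0  # maximum acceptable SD between tissue replicates (Appendix 3)


def percent_viability(od_values, mean_od_negative_control):
    """Percent viability of each replicate, normalised to the negative control (= 100 %)."""
    return [100.0 * od / mean_od_negative_control for od in od_values]


# Hypothetical OD readings for three replicate tissues each.
negative_control_ods = [0.9, 1.0, 1.1]
test_chemical_ods = [0.4, 0.5, 0.6]

viabilities = percent_viability(test_chemical_ods, mean(negative_control_ods))
run_mean = mean(viabilities)
run_sd = stdev(viabilities)  # sample SD; an assumption, see the model SOP for the exact statistic

print(f"{run_mean:.1f} ± {run_sd:.1f} %")  # 50.0 ± 10.0 %
print("SD acceptable:", run_sd <= SD_LIMIT)
```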
Test Report
38.
The test report should include the following information:
Test Chemical and Control Chemicals:
—
Mono-constituent substance: chemical identification, such as IUPAC or CAS name, CAS number, SMILES or InChI code, structural formula, purity, chemical identity of impurities as appropriate and practically feasible, etc;
—
Multi-constituent substance, UVCB and mixture: characterised as far as possible by chemical identity (see above), quantitative occurrence and relevant physicochemical properties of the constituents;
—
Physical appearance, water solubility, and any additional relevant physicochemical properties;
—
Source, lot number if available;
—
Treatment of the test chemical/control chemicals prior to testing, if applicable (e.g. warming, grinding);
—
Stability of the test chemical, limit date for use, or date for re-analysis if known;
—
Storage conditions.
RhE model and protocol used (and rationale for the choice, if applicable)
Test Conditions:
—
RhE model used (including batch number);
—
Calibration information for measuring device (e.g. spectrophotometer), wavelength and band pass (if applicable) used for quantifying MTT formazan, and linearity range of measuring device;
—
Description of the method used to quantify MTT formazan;
—
Description of the qualification of the HPLC/UPLC-spectrophotometry system, if applicable;
—
Complete supporting information for the specific RhE model used including its performance. This should include, but is not limited to:
i)
Viability;
ii)
Barrier function;
iii)
Morphology;
iv)
Reproducibility and predictivity;
v)
Quality controls (QC) of the model;
—
Reference to historical data of the model. This should include, but is not limited to acceptability of the QC data with reference to historical batch data.
—
Demonstration of proficiency in performing the test method before routine use by testing of the proficiency substances.
Test Procedure:
—
Details of the test procedure used (including washing procedures used after exposure period);
—
Dose of test chemical and controls used;
—
Duration and temperature of exposure and post-exposure incubation period;
—
Indication of controls used for direct MTT-reducers and/or colouring test chemicals, if applicable;
—
Number of tissue replicates used per test chemical and controls (PC, negative control, and NSMTT, NSCliving and NSCkilled, if applicable);
—
Description of decision criteria/prediction model applied based on the RhE model used;
—
Description of any modifications to the test procedure (including washing procedures).
Run and Test Acceptance Criteria:
—
Positive and negative control mean values and acceptance ranges based on historical data;
—
Acceptable variability between tissue replicates for positive and negative controls;
—
Acceptable variability between tissue replicates for test chemical.
Results:
—
Tabulation of data for individual test chemical for each run and each replicate measurement including OD or MTT formazan peak area, percent tissue viability, mean percent tissue viability and SD;
—
If applicable, results of controls used for direct MTT-reducers and/or colouring test chemicals including OD or MTT formazan peak area, %NSMTT, %NSCliving, %NSCkilled, SD, final corrected percent tissue viability;
—
Results obtained with the test chemical(s) and controls in relation to the defined run and test acceptance criteria;
—
Description of other effects observed;
—
The derived classification with reference to the prediction model/decision criteria used.
Discussion of the results
Conclusions
LITERATURE
(1)
United Nations (UN) (2013). Globally Harmonized System of Classification and Labelling of Chemicals (GHS), Fifth Revised Edition, UN New York and Geneva, 2013. Available at: http://www.unece.org/trans/danger/publi/ghs/ghs_rev05/05files_e.html.
(2)
EURL-ECVAM (2009). Statement on the “Performance Under UN GHS of Three In Vitro Assays for Skin Irritation Testing and the Adaptation of the Reference Chemicals and Defined Accuracy Values of the ECVAM Skin Irritation Performance Standards”, Issued by the ECVAM Scientific Advisory Committee (ESAC31), 9 April 2009. Available at: https://eurl-ecvam.jrc.ec.europa.eu/about-ecvam/archive-publications/publication//ESAC31_skin-irritation-statement_20090922.pdf
(3)
OECD (2014). Guidance Document on Integrated Approaches to Testing and Assessment for Skin Irritation/Corrosion. Environment, Health and Safety Publications, Series on Testing and Assessment (No 203), Organisation for Economic Cooperation and Development, Paris.
(4)
Chapter B.4 of this Annex, Acute Dermal Irritation, Corrosion.
(5)
Chapter B.40 of this Annex, In Vitro Skin Corrosion: Transcutaneous Electrical Resistance (TER).
(6)
Chapter B.40bis of this Annex, In Vitro Skin Corrosion: Reconstructed Human Epidermis (RHE) test method.
(7)
Chapter B.65 of this Annex, In Vitro Membrane Barrier Test Method.
(8)
OECD (2015). Performance Standards for the Assessment of Proposed Similar or Modified In Vitro Reconstructed Human Epidermis (RhE) Test Methods for Skin Irritation in Relation to TG 439. Environment, Health and Safety Publications, Series on Testing and Assessment (No 220). Organisation for Economic Cooperation and Development, Paris.
(9)
OECD (2005). Guidance Document on the Validation and International Acceptance of New or Updated Test Methods for Hazard Assessment. Environment, Health and Safety Publications, Series on Testing and Assessment (No 34) Organisation for Economic Cooperation and Development, Paris.
(10)
Fentem, J.H., Briggs, D., Chesné, C., Elliot, G.R., Harbell, J.W., Heylings, J.R., Portes, P., Roguet, R., van de Sandt, J.J. M. and Botham, P. (2001). A Prevalidation Study on In Vitro Tests for Acute Skin Irritation, Results and Evaluation by the Management Team, Toxicol. in Vitro 15, 57-93.
(11)
Portes, P., Grandidier, M.-H., Cohen, C. and Roguet, R. (2002). Refinement of the EPISKIN Protocol for the Assessment of Acute Skin Irritation of Chemicals: Follow-Up to the ECVAM Prevalidation Study, Toxicol. in Vitro 16, 765–770.
(12)
Kandárová, H., Liebsch, M., Genschow, E., Gerner, I., Traue, D., Slawik, B. and Spielmann, H. (2004). Optimisation of the EpiDerm Test Protocol for the Upcoming ECVAM Validation Study on In Vitro Skin Irritation Tests, ALTEX 21, 107–114.
(13)
Kandárová, H., Liebsch, M., Gerner, I., Schmidt, E., Genschow, E., Traue, D. and Spielmann, H. (2005). The EpiDerm Test Protocol for the Upcoming ECVAM Validation Study on In Vitro Skin Irritation Tests – An Assessment of the Performance of the Optimised Test, ATLA 33, 351-367.
(14)
Cotovio, J., Grandidier, M.-H., Portes, P., Roguet, R. and Rubinsteen, G. (2005). The In Vitro Acute Skin Irritation of Chemicals: Optimisation of the EPISKIN Prediction Model Within the Framework of the ECVAM Validation Process, ATLA 33, 329-349.
(15)
Zuang, V., Balls, M., Botham, P.A., Coquette, A., Corsini, E., Curren, R.D., Elliot, G.R., Fentem, J.H., Heylings, J.R., Liebsch, M., Medina, J., Roguet, R., van De Sandt, J.J.M., Wiemann, C. and Worth, A. (2002). Follow-Up to the ECVAM Prevalidation Study on In Vitro Tests for Acute Skin Irritation, The European Centre for the Validation of Alternative Methods Skin Irritation Task Force report 2, ATLA 30, 109-129.
(16)
Spielmann, H., Hoffmann, S., Liebsch, M., Botham, P., Fentem, J., Eskes, C., Roguet, R., Cotovio, J., Cole, T., Worth, A., Heylings, J., Jones, P., Robles, C., Kandárová, H., Gamer, A., Remmele, M., Curren, R., Raabe, H., Cockshott, A., Gerner, I. and Zuang, V. (2007). The ECVAM International Validation Study on In Vitro Tests for Acute Skin Irritation: Report on the Validity of the EPISKIN and EpiDerm Assays and on the Skin Integrity Function Test, ATLA 35, 559-601.
(17)
Hoffmann S. (2006). ECVAM Skin Irritation Validation Study Phase II: Analysis of the Primary Endpoint MTT and the Secondary Endpoint IL1-α.
(18)
Eskes C., Cole, T., Hoffmann, S., Worth, A., Cockshott, A., Gerner, I. and Zuang, V. (2007). The ECVAM International Validation Study on In Vitro Tests for Acute Skin Irritation: Selection of Test Chemicals, ATLA 35, 603-619.
(19)
Cotovio, J., Grandidier, M.-H., Lelièvre, D., Roguet, R., Tinois-Tessonneaud, E. and Leclaire, J. (2007). In Vitro Acute Skin Irritancy of Chemicals Using the Validated EPISKIN Model in a Tiered Strategy - Results and Performances with 184 Cosmetic Ingredients, ALTEX, 14, 351-358.
(20)
EURL-ECVAM (2007). Statement on the Validity of In Vitro Tests for Skin Irritation, Issued by the ECVAM Scientific Advisory Committee (ESAC26), 27 April 2007. Available at: https://eurl-ecvam.jrc.ec.europa.eu/about-ecvam/archive-publications/publication//ESAC26_statement_SkinIrritation_20070525_C.pdf
(21)
EURL-ECVAM. (2007). Performance Standards for Applying Human Skin Models to In Vitro Skin Irritation Testing. N.B. These are the original PS used for the validation of two test methods. These PS should not be used any longer as an updated version (8) is now available.
(22)
EURL-ECVAM. (2008). Statement on the Scientific Validity of In Vitro Tests for Skin Irritation Testing, Issued by the ECVAM Scientific Advisory Committee (ESAC29), 5 November 2008. https://eurl-ecvam.jrc.ec.europa.eu/about-ecvam/archive-publications/publication/ESAC_Statement_SkinEthic-EpiDerm-FINAL-0812-01.pdf
(23)
OECD (2010). Explanatory Background Document to the OECD Draft Test Guideline on In Vitro Skin Irritation Testing. Environment, Health and Safety Publications. Series on Testing and Assessment, (No 137), Organisation for Economic Cooperation and Development, Paris.
(24)
Katoh, M., Hamajima, F., Ogasawara, T. and Hata K. (2009). Assessment of Human Epidermal Model LabCyte EPI-MODEL for In Vitro Skin Irritation Testing According to European Centre for the Validation of Alternative Methods (ECVAM)-Validated Protocol, J Toxicol Sci, 34, 327-334
(25)
Katoh, M. and Hata K. (2011). Refinement of LabCyte EPI-MODEL24 Skin Irritation Test Method for Adaptation to the Requirements of OECD Test Guideline 439, AATEX, 16, 111-122
(26)
OECD (2011). Validation Report for the Skin Irritation Test Method Using LabCyte EPI-MODEL24. Environment, Health and Safety Publications, Series on Testing and Assessment (No 159), Organisation for Economic Cooperation and Development, Paris.
(27)
OECD (2011). Peer Review Report of Validation of the Skin Irritation Test Using LabCyte EPI-MODEL24. Environment, Health and Safety Publications, Series on Testing and Assessment (No 155), Organisation for Economic Cooperation and Development, Paris.
(28)
Kojima, H., Ando, Y., Idehara, K., Katoh, M., Kosaka, T., Miyaoka, E., Shinoda, S., Suzuki, T., Yamaguchi, Y., Yoshimura, I., Yuasa, A., Watanabe, Y. and Omori, T. (2012). Validation Study of the In Vitro Skin Irritation Test with the LabCyte EPI-MODEL24, Altern Lab Anim, 40, 33-50.
(29)
Welss, T., Basketter, D.A. and Schröder, K.R. (2004). In Vitro Skin Irritation: Fact and Future. State of the Art Review of Mechanisms and Models, Toxicol. In Vitro 18, 231-243.
(30)
Eskes, C. et al. (2012). Regulatory Assessment of In Vitro Skin Corrosion and Irritation Data within the European Framework: Workshop Recommendations. Regul. Toxicol. Pharmacol. 62, 393-403.
(31)
Mosmann, T. (1983). Rapid Colorimetric Assay for Cellular Growth and Survival: Application to Proliferation and Cytotoxicity Assays, J. Immunol. Methods 65, 55-63.
(32)
EpiSkin™ (February 2009). SOP, Version 1.8, ECVAM Skin Irritation Validation Study: Validation of the EpiSkin™ Test Method 15 min - 42 hours for the Prediction of Acute Skin Irritation of Chemicals.
(33)
EpiDerm™ (Revised March 2009). SOP, Version 7.0, Protocol for: In Vitro EpiDerm™ Skin Irritation Test (EPI-200-SIT), for Use with MatTek Corporation's Reconstructed Human Epidermal Model EpiDerm (EPI-200).
(34)
SkinEthic™ RHE (February 2009) SOP, Version 2.0, SkinEthic Skin Irritation Test-42bis Test Method for the Prediction of Acute Skin Irritation of Chemicals: 42 Minutes Application + 42 Hours Post-Incubation.
(35)
LabCyte (June 2011). EPI-MODEL24 SIT SOP, Version 8.3, Skin Irritation Test Using the Reconstructed Human Model “LabCyte EPI-MODEL24”
(36)
Alépée, N., Barroso, J., De Smedt, A., De Wever, B., Hibatallah, J., Klaric, M., Mewes, K.R., Millet, M., Pfannenbecker, U., Tailhardat, M., Templier, M., and McNamee, P. Use of HPLC/UPLC-Spectrophotometry for Detection of MTT Formazan in In Vitro Reconstructed Human Tissue (RhT)-Based Test Methods Employing the MTT Assay to Expand their Applicability to Strongly Coloured Test Chemicals. Manuscript in preparation.
(37)
US FDA (2001). Guidance for Industry: Bioanalytical Method Validation. U.S. Department of Health and Human Services, Food and Drug Administration. May 2001. Available at: [http://www.fda.gov/downloads/Drugs/Guidances/ucm070107.pdf].
(38)
Harvell, J.D., Lamminstausta, K., and Maibach, H.I. (1995). Irritant Contact Dermatitis, in: Practical Contact Dermatitis, pp 7-18, (Ed. Guin J. D.). Mc Graw-Hill, New York.
(39)
EURL-ECVAM (2009). Performance Standards for In Vitro Skin Irritation Test Methods Based on Reconstructed Human Epidermis (RhE). N.B. This is the current version of the ECVAM PS, updated in 2009 in view of the implementation of UN GHS. These PS should not be used any longer as an updated version (8) is now available related to the present TG.
(40)
EURL-ECVAM. (2009). ESAC Statement on the Performance Standards (PS) for In Vitro Skin Irritation Testing Using Reconstructed Human Epidermis, Issued by the ECVAM Scientific Advisory Committee (ESAC31), 8 July 2009.
(41)
EC (2001). Commission Directive 2001/59/EC of 6 August 2001 Adapting to Technical Progress for the 28th Time Council Directive 67/548/EEC on the Approximation of Laws, Regulations and Administrative Provisions Relating to the Classification, Packaging and Labelling of Dangerous Substances, Official Journal of the European Union L225, 1-333.
Appendix 1
DEFINITIONS
Accuracy: The closeness of agreement between test method results and accepted reference values. It is a measure of test method performance and one aspect of relevance. The term is often used interchangeably with “concordance” to mean the proportion of correct outcomes of a test method (9).
Cell viability: Parameter measuring total activity of a cell population e.g. as ability of cellular mitochondrial dehydrogenases to reduce the vital dye MTT (3-(4,5-Dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide, Thiazolyl blue), which depending on the endpoint measured and the test design used, correlates with the total number and/or vitality of living cells.
Chemical: means a substance or a mixture.
Concordance: This is a measure of performance for test models that give a categorical result, and is one aspect of relevance. The term is sometimes used interchangeably with accuracy, and is defined as the proportion of all chemicals tested that are correctly classified as positive or negative. Concordance is highly dependent on the prevalence of positives in the types of test chemical being examined (9).
ET50: Can be estimated by determination of the exposure time required to reduce cell viability by 50 % upon application of the benchmark chemical at a specified, fixed concentration, see also IC50.
GHS (Globally Harmonized System of Classification and Labelling of Chemicals by the United Nations (UN)): A system proposing the classification of chemicals (substances and mixtures) according to standardised types and levels of physical, health and environmental hazards, and addressing corresponding communication elements, such as pictograms, signal words, hazard statements, precautionary statements and safety data sheets, so as to convey information on their adverse effects with a view to protecting people (including employers, workers, transporters, consumers and emergency responders) and the environment (1).
HPLC: High Performance Liquid Chromatography.
IATA: Integrated Approach on Testing and Assessment
IC50: Can be estimated by determination of the concentration at which a benchmark chemical reduces the viability of the tissues by 50 % (IC50) after a fixed exposure time, see also ET50.
Infinite dose: Amount of test chemical applied to the epidermis exceeding the amount required to completely and uniformly cover the epidermis surface.
Mixture: A mixture or a solution composed of two or more substances.
Mono-constituent substance: A substance, defined by its quantitative composition, in which one main constituent is present to at least 80 % (w/w).
MTT: 3-(4,5-Dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide; Thiazolyl blue tetrazolium bromide.
Multi-constituent substance: A substance, defined by its quantitative composition, in which more than one main constituent is present in a concentration ≥ 10 % (w/w) and < 80 % (w/w). A multi-constituent substance is the result of a manufacturing process. The difference between mixture and multi-constituent substance is that a mixture is obtained by blending of two or more substances without chemical reaction. A multi-constituent substance is the result of a chemical reaction.
NSCkilled: Non-Specific Colour in killed tissues.
NSCliving: Non-Specific Colour in living tissues.
NSMTT: Non-Specific MTT reduction.
Performance standards (PS): Standards, based on a validated test method, that provide a basis for evaluating the comparability of a proposed test method that is mechanistically and functionally similar. Included are; (i) essential test method components; (ii) a minimum list of Reference Chemicals selected from among the chemicals used to demonstrate the acceptable performance of the validated test method; and (iii) the comparable levels of accuracy and reliability, based on what was obtained for the validated test method, that the proposed test method should demonstrate when evaluated using the minimum list of Reference Chemicals (9).
PC: Positive Control, a replicate containing all components of a test system and treated with a chemical known to induce a positive response. To ensure that variability in the positive control response across time can be assessed, the magnitude of the positive response should not be excessive.
Relevance: Description of relationship of the test to the effect of interest and whether it is meaningful and useful for a particular purpose. It is the extent to which the test correctly measures or predicts the biological effect of interest. Relevance incorporates consideration of the accuracy (concordance) of a test method (9).
Reliability: Measures of the extent that a test method can be performed reproducibly within and between laboratories over time, when performed using the same protocol. It is assessed by calculating intra- and inter-laboratory reproducibility (9).
Replacement test: A test which is designed to substitute for a test that is in routine use and accepted for hazard identification and/or risk assessment, and which has been determined to provide equivalent or improved protection of human or animal health or the environment, as applicable, compared to the accepted test, for all possible testing situations and chemicals (9).
Run: A run consists of one or more test chemicals tested concurrently with a negative control and with a PC.
Sensitivity: The proportion of all positive/active test chemicals that are correctly classified by the test. It is a measure of accuracy for a test method that produces categorical results, and is an important consideration in assessing the relevance of a test method (9).
Skin irritation in vivo: The production of reversible damage to the skin following the application of a test chemical for up to 4 hours. Skin irritation is a locally arising reaction of the affected skin tissue and appears shortly after stimulation (38). It is caused by a local inflammatory reaction involving the innate (non-specific) immune system of the skin tissue. Its main characteristic is its reversible process involving inflammatory reactions and most of the clinical characteristic signs of irritation (erythema, oedema, itching and pain) related to an inflammatory process.
Specificity: The proportion of all negative/inactive test chemicals that are correctly classified by the test. It is a measure of accuracy for a test method that produces categorical results and is an important consideration in assessing the relevance of a test method (9).
Substance: A chemical element and its compounds in the natural state or obtained by any production process, including any additive necessary to preserve its stability and any impurities deriving from the process used, but excluding any solvent which may be separated without affecting the stability of the substance or changing its composition.
Test chemical: Any substance or mixture tested using this test method.
UVCB: substances of unknown or variable composition, complex reaction products or biological materials.
Appendix 2
TEST MODELS INCLUDED IN THIS TEST METHOD
Nr.
Test model name
Validation study type
References
1
EpiSkin™
Full prospective validation study (2003-2007). The components of this model were used to define the essential test method components of the original and updated ECVAM PS (39) (40) (21) (*3). Moreover, the method's data relating to identification of non-classified vs classified substances formed the main basis for defining the specificity and sensitivity values of the original PS (*3).
EpiDerm™ (original): Initially the test model underwent full prospective validation together with Nr. 1. from 2003-2007. The components of this model were used to define the essential test methods components of the original and updated ECVAM PS (39) (40) (21) (*3). EpiDerm™ SIT (EPI-200): A modification of the original EpiDerm™ was validated using the original ECVAM PS (21) in 2008 (*3)
Validation study based on the original ECVAM Performance Standards (21) in 2008 (*3).
(2) (21) (22) (23) (31)
4
LabCyte EPI-MODEL24 SIT
Validation study (2011-2012) based on the Performance Standards (PS) of OECD TG 439 (8) which are based on the updated ECVAM PS (*3) (39) (40).
(24) (25) (26) (27) (28) (35) (39) (40) and PS of this TG (8) (*3)
SIT: Skin Irritation Test
RHE: Reconstructed Human Epidermis
Appendix 3
PROTOCOL PARAMETERS SPECIFIC TO EACH OF THE TEST MODELS INCLUDED IN THIS TEST METHOD
The RhE models do show very similar protocols and notably all use a post-incubation period of 42 hours (32) (33) (34) (35). Variations concern mainly three parameters relating to the different barrier functions of the test models and listed here: A) pre-incubation time and volume, B) Application of test chemicals and C) Post-incubation volume.
| | EpiSkin™ (SM) | EpiDerm™ SIT (EPI-200) | SkinEthic RHE™ | LabCyte EPI-MODEL24 SIT |
|---|---|---|---|---|
| A) Pre-incubation | | | | |
| Incubation time | 18-24 hours | 18-24 hours | < 2 hours | 15-30 hours |
| Medium volume | 2 ml | 0,9 ml | 0,3 or 1 ml | 0,5 ml |
| B) Test chemical application | | | | |
| For liquids | 10 μl (26 μl/cm²) | 30 μl (47 μl/cm²) | 16 μl (32 μl/cm²) | 25 μl (83 μl/cm²) |
| For solids | 10 mg (26 mg/cm²) + DW (5 μl) | 25 mg (39 mg/cm²) + DPBS (25 μl) | 16 mg (32 mg/cm²) + DW (10 μl) | 25 mg (83 mg/cm²) + DW (25 μl) |
| Use of nylon mesh | Not used | If necessary | Applied | Not used |
| Total application time | 15 minutes | 60 minutes | 42 minutes | 15 minutes |
| Application temperature | RT | a) at RT for 25 minutes; b) at 37 °C for 35 minutes | RT | RT |
| C) Post-incubation volume | | | | |
| Medium volume | 2 ml | 0,9 ml x 2 | 2 ml | 1 ml |
| D) Maximum acceptable variability | | | | |
| Standard deviation between tissue replicates | SD ≤ 18 | SD ≤ 18 | SD ≤ 18 | SD ≤ 18 |
RT: Room temperature
DW: distilled water
DPBS: Dulbecco’s Phosphate Buffer Saline
Appendix 4
KEY PARAMETERS AND ACCEPTANCE CRITERIA FOR QUALIFICATION OF AN HPLC/UPLC-SPECTROPHOTOMETRY SYSTEM FOR MEASUREMENT OF MTT FORMAZAN EXTRACTED FROM RHE TISSUES
Parameter
Protocol Derived from FDA Guidance (36) (37)
Acceptance Criteria
Selectivity
Analysis of isopropanol, living blank (isopropanol extract from living RhE tissues without any treatment), dead blank (isopropanol extract from killed RhE tissues without any treatment)
Quality Controls (i.e. MTT formazan at 1,6 μg/ml, 16 μg/ml and 160 μg/ml) in isopropanol (n=5)
CV ≤ 15 % or ≤ 20 % for the LLOQ
Accuracy
Quality Controls in isopropanol (n=5)
%Dev ≤ 15 % or ≤ 20 % for LLOQ
Matrix Effect
Quality Controls in living blank (n=5)
85 % ≤ Matrix Effect % ≤ 115 %
Carryover
Analysis of isopropanol after an ULOQ (20) standard
Areainterference ≤ 20 % of AreaLLOQ
Reproducibility (intra-day)
3 independent calibration curves (based on 6 consecutive 1/3 dilutions of MTT formazan in isopropanol starting at ULOQ, i.e. 200 μg/ml);
Quality Controls in isopropanol (n=5)
Calibration Curves: %Dev ≤ 15 % or ≤ 20 % for LLOQ
Quality Controls: %Dev ≤ 15 % and CV ≤ 15 %
Reproducibility (inter-day)
Day 1: 1 calibration curve and Quality Controls in isopropanol (n=3)
Day 2: 1 calibration curve and Quality Controls in isopropanol (n=3)
Day 3: 1 calibration curve and Quality Controls in isopropanol (n=3)
Short Term Stability of MTT Formazan in RhE Tissue Extract
Quality Controls in living blank (n=3) analysed the day of the preparation and after 24 hours of storage at room temperature
%Dev ≤ 15 %
Long Term Stability of MTT Formazan in RhE Tissue Extract, if required
Quality Controls in living blank (n=3) analysed the day of the preparation and after several days of storage at a specified temperature (e.g. 4 °C, -20 °C, -80 °C)
%Dev ≤ 15 %
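The CV and %Dev acceptance criteria listed in this Appendix can be evaluated with a short sketch such as the following. The function names, the use of the sample standard deviation and the example Quality Control readings are assumptions made for illustration; only the 15 % and 20 % (LLOQ) limits are taken from the table above.

```python
from statistics import mean, stdev


def cv_percent(measured):
    """Coefficient of variation (%) of replicate measurements (sample SD assumed)."""
    return 100.0 * stdev(measured) / mean(measured)


def dev_percent(measured, nominal):
    """Deviation (%) of the mean measured concentration from the nominal concentration."""
    return 100.0 * (mean(measured) - nominal) / nominal


def within_limit(value_pct, is_lloq=False):
    """Apply the 15 % limit, or the 20 % limit at the LLOQ."""
    limit = 20.0 if is_lloq else 15.0
    return abs(value_pct) <= limit


# Hypothetical Quality Control readings (n=5) at a nominal 16 μg/ml MTT formazan.
qc_mid = [15.2, 16.1, 16.4, 15.8, 16.5]

cv = cv_percent(qc_mid)
dev = dev_percent(qc_mid, nominal=16.0)
print(f"CV = {cv:.1f} %, acceptable: {within_limit(cv)}")
print(f"%Dev = {dev:.1f} %, acceptable: {within_limit(dev)}")
```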
"
(8)
In Part B, the following Chapters are added:
"B.63 REPRODUCTION/DEVELOPMENTAL TOXICITY SCREENING TEST
INTRODUCTION
1.
This test method is equivalent to OECD test guideline (TG) 421 (2016). OECD guidelines for the testing of chemicals are periodically reviewed in the light of scientific progress. The original screening test guideline 421 was adopted in 1995, based on a protocol for a "Preliminary Reproduction Toxicity Screening Test" discussed in two expert meetings, in London in 1990 (1) and in Tokyo in 1992 (2).
2.
This test method has been updated with endocrine disruptor relevant endpoints, as a follow up to the high-priority activity initiated at OECD in 1998 to revise existing test guidelines and to develop new test guidelines for the screening and testing of potential endocrine disruptors (3). OECD TG 407 (Repeated Dose 28-Day Oral Toxicity Study in Rodents, Chapter B.7 of this Annex) for example, was enhanced in 2008 by parameters suitable to detect endocrine activity of test chemicals. The objective in updating TG 421 was to include some endocrine disruptor relevant endpoints in screening TGs where the exposure periods cover some of the sensitive periods during development (pre- or early postnatal periods).
3.
The selected additional endocrine disrupter relevant endpoints, also part of TG 443 (Extended One Generation Reproductive Toxicity Study, Chapter B.56 of this Annex), were included in TG 421 based on a feasibility study addressing scientific and technical questions related to their inclusion, as well as possible adaptations of the test design needed for their inclusion (4).
4.
This test method is designed to generate limited information concerning the effects of a test chemical on male and female reproductive performance such as gonadal function, mating behaviour, conception, development of the conceptus and parturition. It is not an alternative to, nor does it replace the existing test methods B.31, B.34, B.35 or B.56.
INITIAL CONSIDERATIONS
5.
This screening test method can be used to provide initial information on possible effects on reproduction and/or development, either at an early stage of assessing the toxicological properties of chemicals, or on chemicals of concern. It can also be used as part of a set of initial screening tests for existing chemicals for which little or no toxicological information is available, as a dose range finding study for more extensive reproduction/developmental studies, or when otherwise considered relevant. In conducting the study, the guiding principles and considerations outlined in the OECD guidance document no 19 on the recognition, assessment, and use of clinical signs as humane endpoints for experimental animals used in safety evaluations (5) should be followed.
6.
This test method does not provide complete information on all aspects of reproduction and development. In particular, it offers only limited means of detecting post-natal manifestations of pre-natal exposure, or effects that may be induced during post-natal exposure. Due (amongst other reasons) to the relatively small numbers of animals in the dose groups, the selectivity of the end points, and the short duration of the study, this method will not provide evidence for definite claims of no effects. Moreover, in the absence of data from other reproduction/developmental toxicity tests, positive results are useful for initial hazard assessment and contribute to decisions with respect to the necessity and timing of additional testing.
7.
The results obtained by the endocrine related parameters should be seen in the context of the "OECD Conceptual Framework for Testing and Assessment of Endocrine Disrupting Chemicals" (6). In this Conceptual Framework, the enhanced OECD TG 421 is contained in level 4 as an in vivo assay providing data on adverse effects on endocrine relevant endpoints. An endocrine signal might not however be considered sufficient evidence on its own that the test chemical is an endocrine disruptor.
8.
This test method assumes oral administration of the test chemical. Modifications may be required if other routes of exposure are used.
9.
Before use of the test method on a mixture for generating data for an intended regulatory purpose, it should be considered whether, and if so why, it may provide adequate results for that purpose. Such considerations are not needed when there is a regulatory requirement for testing of the mixture.
10.
Definitions used are given in Appendix 1.
PRINCIPLE OF THE TEST
11.
The test chemical is administered in graduated doses to several groups of males and females. Males should be dosed for a minimum of four weeks and up to and including the day before scheduled kill (this includes a minimum of two weeks prior to mating, during the mating period and, approximately, two weeks post-mating). In view of the limited pre-mating dosing period in males, fertility may not be a particularly sensitive indicator of testicular toxicity. Therefore, a detailed histological examination of the testes is essential. The combination of a pre-mating dosing period of two weeks and subsequent mating/fertility observations with an overall dosing period of at least four weeks, followed by detailed histopathology of the male gonads, is considered sufficient to enable detection of the majority of effects on male fertility and spermatogenesis.
12.
Females should be dosed throughout the study. This includes two weeks prior to mating (with the objective of covering at least two complete oestrous cycles), the variable time to conception, the duration of pregnancy and at least thirteen days after delivery, up to and including the day before scheduled kill.
13.
The duration of the study, following acclimatisation and pre-dosing oestrous cycle evaluation, depends on female performance and is approximately 63 days (at least 14 days premating, up to 14 days mating, 22 days gestation and 13 days lactation).
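The approximate duration stated in paragraph 13 can be reproduced with a short illustrative calculation. This is only a sketch: the dictionary and function names are hypothetical, and the phase lengths are the nominal values quoted above (the mating period in particular may be shorter in practice).

```python
# Nominal phase lengths in days, as quoted in paragraph 13.
PHASES = {
    "premating": 14,  # at least 14 days
    "mating": 14,     # up to 14 days
    "gestation": 22,
    "lactation": 13,
}


def nominal_study_duration(phases=PHASES):
    """Return the sum of the phase lengths in days."""
    return sum(phases.values())


print(nominal_study_duration())  # 63
```

A female that mates on the fourth day of pairing would, under the same assumptions, shorten the total by ten days.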
14.
During the period of administration, the animals are observed closely each day for signs of toxicity. Animals which die or are killed during the test period are necropsied and, at the conclusion of the test, surviving animals are killed and necropsied.
DESCRIPTION OF THE METHOD
Selection of animal species
15.
This test method is designed for use with the rat. If the parameters specified within this test method are investigated in another rodent species a detailed justification should be given. In the international validation program for the detection of endocrine disrupters in OECD TG 407 (corresponding to Chapter B.7 of this Annex), the rat was the only species used. Strains with low fecundity or well-known high incidence of developmental defects should not be used. Healthy virgin animals, not subjected to previous experimental procedures, should be used. The test animals should be characterised as to species, strain, sex, weight and age. At the commencement of the study the weight variation of animals used should be minimal and not exceed 20 % of the mean weight of each sex. Where the study is conducted as a preliminary study to a long-term or a full-generation study, it is preferable that animals from the same strain and source are used in both studies.
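The weight criterion in paragraph 15 can be checked per sex with a small helper. This is a sketch only: the function name is hypothetical, and reading the 20 % criterion as a maximum deviation of each individual animal from the mean weight of its sex is an assumption of this example.

```python
from statistics import mean


def weights_within_limit(weights_g, limit=0.20):
    """Return True if every weight lies within +/- limit of the group mean.

    Assumption: the 20 % criterion is applied to each animal's deviation
    from the mean weight of its sex. Call once for males, once for females.
    """
    group_mean = mean(weights_g)
    return all(abs(w - group_mean) <= limit * group_mean for w in weights_g)


print(weights_within_limit([300, 310, 290, 305]))  # True
print(weights_within_limit([200, 300, 400]))       # False
```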
Housing and feeding
16.
All procedures should conform to local standards of laboratory animal care. The temperature in the experimental animal room should be 22 °C (± 3 °C). Although the relative humidity should be at least 30 % and preferably not exceed 70 % other than during room cleaning, the aim should be 50-60 %. Lighting should be artificial, the photoperiod being 12 hours light, 12 hours dark. For feeding, conventional laboratory diets may be used with an unlimited supply of drinking water. The choice of diet may be influenced by the need to ensure a suitable admixture of a test chemical when administered by this method.
17.
Animals should be group housed in small groups of the same sex; animals may be housed individually if scientifically justified. For group caging, no more than five animals should be housed per cage. Mating procedures should be carried out in cages suitable for the purpose. Pregnant females should be caged individually and provided with nesting materials. Lactating females will be caged individually with their offspring.
18.
The feed should be regularly analysed for contaminants. A sample of the diet should be retained until finalisation of the report.
Preparation of the animals
19.
Healthy young adult animals are randomly assigned to the control and treatment groups. Cages should be arranged in such a way that possible effects due to cage placement are minimised. The animals are uniquely identified and kept in their cages for at least five days prior to the start of the study to allow for acclimatisation to the laboratory conditions.
Preparation of doses
20.
It is recommended that the test chemical be administered orally unless other routes of administration are considered more appropriate. When the oral route is selected, the test chemical is usually administered by gavage; however, alternatively, test chemicals may be administered via the diet or drinking water.
21.
Where necessary, the test chemical is dissolved or suspended in a suitable vehicle. It is recommended that, wherever possible, the use of an aqueous solution/suspension be considered first, followed by consideration of a solution/emulsion in oil (e.g. corn oil) and then by possible solution in other vehicles. For vehicles other than water the toxic characteristics of the vehicle should be known. The stability and homogeneity of the test chemical in the vehicle should be determined.
PROCEDURE
Number and sex of animals
22.
It is recommended that each group be started with at least 10 males and 12-13 females. Females will be evaluated pre-exposure for oestrous cyclicity and animals that fail to exhibit typical 4-5 day cycles will not be included in the study; therefore, extra females are recommended in order to yield 10 females per group. Except in the case of marked toxic effects, it is expected that this will provide at least 8 pregnant females per group which normally is the minimum acceptable number of pregnant females per group. The objective is to produce enough pregnancies and offspring to assure a meaningful evaluation of the potential of the test chemical to affect fertility, pregnancy, maternal and suckling behaviour, and growth and development of the F1 offspring from conception to day 13 post-partum.
Dosage
23.
Generally, at least three test groups and a control group should be used. Dose levels may be based on information from acute toxicity tests or on results from repeated dose studies. Except for treatment with the test chemical, animals in the control group should be handled in an identical manner to the test group subjects. If a vehicle is used in administering the test chemical, the control group should receive the vehicle in the highest volume used.
24.
Dose levels should be selected taking into account any existing toxicity and (toxico-) kinetic data available. It should also be taken into account that there may be differences in sensitivity between pregnant and non-pregnant animals. The highest dose level should be chosen with the aim of inducing toxic effects but not death or severe suffering. Thereafter, a descending sequence of dose levels should be selected with a view to demonstrating any dosage related response and a no-observed-adverse-effect level (NOAEL) at the lowest dose level. Two- to four-fold intervals are frequently optimal for setting the descending dose levels and addition of a fourth test group is often preferable to using very large intervals (e.g. more than a factor of 10) between dosages.
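The spacing of dose levels described in paragraph 24 can be illustrated as follows. This is a sketch under stated assumptions: the function name, the constant spacing factor and the example top dose are hypothetical, and the choice of the highest dose itself remains a matter of toxicological judgement.

```python
def descending_doses(top_dose_mg_kg, factor, n_groups=3):
    """Return n_groups dose levels, highest first, spaced by a constant factor."""
    if factor <= 1:
        raise ValueError("factor must be greater than 1")
    return [top_dose_mg_kg / factor ** i for i in range(n_groups)]


# Three test groups with a four-fold interval below a top dose of 1000 mg/kg bw/day.
print(descending_doses(1000, 4))  # [1000.0, 250.0, 62.5]
```

With a two-fold interval and a fourth test group, the same top dose would give 1000, 500, 250 and 125 mg/kg body weight/day.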
25.
In the presence of observed general toxicity (e.g. reduced body weight, liver, heart, lung or kidney effects, etc.) or other changes that may not be toxic responses (e.g. reduced food intake, liver enlargement), observed effects on endocrine sensitive endpoints should be interpreted with caution.
Limit test
26.
If an oral study at one dose level of at least 1 000 mg/kg body weight/day or, for dietary or drinking water administration, an equivalent percentage in the diet or drinking water, using the procedures described for this study, produces no observable toxic effects and if toxicity would not be expected based upon data from structurally related substances, then a full study using several dose levels may not be considered necessary. The limit test applies except when human exposure indicates the need for a higher oral dose level to be used. For other types of administration, such as inhalation or dermal application, the physico-chemical properties of the test chemicals often may dictate the maximum attainable concentration.
Administration of doses
27.
The animals are dosed with the test chemical daily for 7 days a week. When the test chemical is administered by gavage, this should be done in a single dose to the animals using a stomach tube or a suitable intubation cannula. The maximum volume of liquid that can be administered at one time depends on the size of the test animal. The volume should not exceed 1 ml/100 g body weight, except in the case of aqueous solutions where 2 ml/100 g body weight may be used. Except for irritating or corrosive test chemicals which will normally reveal exacerbated effects with higher concentrations, variability in test volume should be minimised by adjusting the concentration to ensure a constant volume at all dose levels.
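The volume limits in paragraph 27, and the adjustment of concentration to keep the volume constant, can be sketched as below. The function names and example values are hypothetical; only the limits of 1 ml/100 g and 2 ml/100 g body weight come from the text.

```python
def max_gavage_volume_ml(body_weight_g, aqueous=False):
    """Maximum single gavage volume in ml for an animal of the given weight."""
    ml_per_100_g = 2.0 if aqueous else 1.0
    return ml_per_100_g * body_weight_g / 100.0


def concentration_mg_per_ml(dose_mg_per_kg, volume_ml_per_kg):
    """Concentration that delivers the dose at a fixed dosing volume."""
    return dose_mg_per_kg / volume_ml_per_kg


print(max_gavage_volume_ml(250))                # 2.5
print(max_gavage_volume_ml(250, aqueous=True))  # 5.0
print(concentration_mg_per_ml(100, 10))         # 10.0
```

Keeping the dosing volume fixed (here 10 ml/kg, which corresponds to 1 ml/100 g) and varying only the concentration gives the constant volume at all dose levels that the paragraph asks for.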
28.
For test chemical administered via the diet or drinking water, it is important to ensure that the quantities of the test chemical involved do not interfere with normal nutrition or water balance. When the test chemical is administered in the diet either a constant dietary concentration (ppm) or a constant dose level in terms of the animals' body weight may be used; the alternative used should be specified. For a test chemical administered by gavage, the dose should be given at similar times each day, and adjusted at least weekly to maintain a constant dose level in terms of animal body weight.
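For dietary administration, a constant concentration (ppm) and a dose per unit body weight are related through food intake. The sketch below assumes that ppm means mg of test chemical per kg of diet; the function name and example values are hypothetical.

```python
def dietary_dose_mg_per_kg_bw(conc_ppm, food_intake_g_per_day, body_weight_g):
    """Convert a dietary concentration to an achieved dose in mg/kg bw/day.

    Assumes ppm = mg test chemical per kg diet. The gram units of food
    intake and body weight cancel, so no further conversion is needed.
    """
    return conc_ppm * food_intake_g_per_day / body_weight_g


# A 250 g rat eating 20 g/day of diet containing 1000 ppm.
print(dietary_dose_mg_per_kg_bw(1000, 20, 250))  # 80.0
```

Because food intake relative to body weight changes during growth, pregnancy and lactation, a constant dietary concentration does not correspond to a constant dose, which is why the alternative used should be specified.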
Experimental schedule
29.
Dosing of both sexes should begin at least 2 weeks prior to mating, after they have been acclimatised for at least five days and females have been screened for normal oestrous cycles (in a 2-week pre-treatment period). The study should be scheduled in such a way that oestrous cycle evaluation begins soon after the animals have attained full sexual maturity. This may vary slightly for different strains of rats in different laboratories, e.g. Sprague Dawley rats 10 weeks of age, Wistar rats about 12 weeks of age. Dams with offspring should be killed on day 13 post-partum, or shortly thereafter. The day of birth (viz. when parturition is complete) is defined as day 0 post-partum. Females showing no evidence of copulation are killed 24-26 days after the last day of the mating period. Dosing is continued in both sexes during the mating period. Males should further be dosed after the mating period at least until the minimum total dosing period of 28 days has been completed. They are then killed, or, alternatively, are retained and continued to be dosed for the possible conduct of a second mating if considered appropriate.
30.
Daily dosing of the parental females should continue throughout pregnancy and at least up to, and including, day 13 post-partum or the day before sacrifice. For studies where the test chemical is administered by inhalation or by the dermal route, dosing should be continued at least up to, and including, day 19 of gestation, and dosing should be re-initiated as soon as possible and not later than PND 4.
31.
A diagram of the experimental schedule is given in Appendix 2.
Mating procedure
32.
Normally, 1:1 (one male to one female) matings should be used in this study. Exceptions can arise in the case of occasional deaths of males. The female should be placed with the same male until evidence of copulation is observed or two weeks have elapsed. Each morning the females should be examined for the presence of sperm or a vaginal plug. Day 0 of pregnancy is defined as the day on which mating evidence is confirmed (a vaginal plug or sperm is found). In case pairing is unsuccessful, re-mating of females with proven males of the same group could be considered.
Litter size
33.
On day 4 after birth, the size of each litter may be adjusted by eliminating extra pups by random selection to yield, as nearly as possible, four or five pups per sex per litter depending on the normal litter size in the strain of rats used. Blood samples should be collected from two of the surplus pups, pooled, and used for determination of serum T4 levels. Selective elimination of pups, e.g. based upon body weight, or anogenital distance (AGD) is not appropriate. Whenever the number of male or female pups prevents having four or five of each sex per litter, partial adjustment (for example, six males and four females) is acceptable. No pups will be eliminated when litter size will drop below the culling target (8 or 10 pups/litter). If there is only one pup available above the culling target, only one pup will be eliminated and used for blood collection for possible serum T4 assessments.
34.
If litter size is not adjusted, two pups per litter are sacrificed on day 4 after birth and blood samples are taken for measurement of serum thyroid hormone concentrations. If possible the two pups per litter should be female pups to reserve male pups for nipple retention evaluations except in the event that removing these pups leaves no remaining females for assessment at termination. No pups will be eliminated when litter size will drop below 8 or 10 pups/litter (depending on the normal litter size in the strain of rats used). If there is only one pup available above the normal litter size, only one pup will be eliminated and used for blood collection for possible serum T4 assessments.
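The day 4 rules in paragraphs 33 and 34 can be summarised in a short sketch. It is a simplification: the function name is hypothetical, the target of 8 or 10 pups per litter is passed in as a parameter, and the per-sex balance and the preference for female pups are not modelled.

```python
def day4_pup_plan(litter_size, target=8, adjust_litter=True):
    """Return how many pups are removed and how many are blood sampled.

    No pup is removed if that would take the litter below the target.
    At most two pups are used for blood collection.
    """
    surplus = max(0, litter_size - target)
    blood_sampled = min(2, surplus)
    removed = surplus if adjust_litter else blood_sampled
    return {"removed": removed, "blood_sampled": blood_sampled}


print(day4_pup_plan(12))                       # {'removed': 4, 'blood_sampled': 2}
print(day4_pup_plan(12, adjust_litter=False))  # {'removed': 2, 'blood_sampled': 2}
print(day4_pup_plan(9))                        # {'removed': 1, 'blood_sampled': 1}
```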
In life observations
Clinical observations
35.
Throughout the test period, general clinical observations should be made at least once a day, and more frequently when signs of toxicity are observed. They should be made preferably at the same time(s) each day, considering the peak period of anticipated effects after dosing. Pertinent behavioural changes, signs of difficult or prolonged parturition and all signs of toxicity, including mortality, should be recorded. These records should include time of onset, degree and duration of toxicity signs.
Body weight and food/water consumption
36.
Males and females should be weighed on the first day of dosing, at least weekly thereafter, and at termination. During pregnancy, females should be weighed on days 0, 7, 14 and 20 and within 24 hours of parturition (day 0 or 1 post-partum) and at least day 4 and 13 post-partum. These observations should be reported individually for each adult animal.
37.
During pre-mating, pregnancy and lactation, food consumption should be measured at least weekly. The measurement of food consumption during mating is optional. Water consumption during these periods should also be measured when the test chemical is administered via drinking water.
Oestrous cycles
38.
Oestrous cycles should be monitored before treatment starts to select for the study females with regular cyclicity (see paragraph 22). Vaginal smears should also be monitored daily from the beginning of the treatment period until evidence of mating. If there is concern about acute stress effects that could alter oestrous cycles with the initiation of dosing, laboratories can expose test animals for 2 weeks, then collect vaginal smears daily to monitor oestrous cycle for a minimum of two weeks during the pre-mating period with continued monitoring into the mating period until there is evidence of mating. When obtaining vaginal/cervical cells, care should be taken to avoid disturbance of mucosa, which could induce pseudopregnancy (7) (8).
Offspring parameters
39.
The duration of gestation should be recorded and is calculated from day 0 of pregnancy. Each litter should be examined as soon as possible after delivery to establish the number and sex of pups, stillbirths, live births, runts (pups that are significantly smaller than corresponding control pups) and the presence of gross abnormalities.
40.
Live pups should be counted and sexed and litters weighed within 24 hours of parturition (day 0 or 1 post-partum) and at least on day 4 and 13 post-partum. In addition to the observations described in paragraph 35, any abnormal behaviour of the offspring should be recorded.
41.
The AGD of each pup should be measured on the same postnatal day between PND 0 and PND 4. Pup body weight should be collected on the day the AGD is measured and the AGD should be normalised to a measure of pup size, preferably the cube root of body weight (9). The number of nipples/areolae in male pups should be counted on PND 12 or 13 as recommended in OECD GD 151 (10).
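The normalisation of AGD to the cube root of body weight mentioned in paragraph 41 can be written as a one-line calculation. The function name, the units and the example values are assumptions made for illustration only.

```python
def normalised_agd(agd_mm, body_weight_g):
    """Return AGD divided by the cube root of body weight."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return agd_mm / body_weight_g ** (1.0 / 3.0)


# A pup weighing 8 g with an AGD of 4 mm gives a value close to 2.0.
print(round(normalised_agd(4.0, 8.0), 3))
```

Dividing by the cube root rather than by body weight itself reflects that AGD is a length while body weight scales roughly with volume.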
Clinical biochemistry
42.
Blood samples from a defined site are taken based on the following schedule:
—
from at least two pups per litter on day 4 after birth, if the number of pups allows (see paragraphs 33-34)
—
from all dams and at least two pups per litter at termination on day 13, and
—
from all adult males, at termination.
All blood samples are stored under appropriate conditions. Blood samples from the day 13 pups and the adult males are assessed for serum levels for thyroid hormones (T4). Further assessment of T4 in blood samples from the dams and day 4 pups is done if relevant. As an option other hormones may be measured if relevant. Pup blood can be pooled by litter for thyroid hormone analyses. Thyroid hormones (T4 and TSH) should preferably be measured as ‘total’.
43.
The following factors may influence the variability and the absolute concentrations of the hormone determinations:
—
time of sacrifice because of diurnal variation of hormone concentrations
—
method of sacrifice to avoid undue stress to the animals that may affect hormone concentrations
—
test kits for hormone determinations that may differ by their standard curves.
44.
Plasma samples specifically intended for hormone determination should be obtained at a comparable time of the day. The numerical values obtained when analysing hormone concentrations differ with various commercial assay kits.
Pathology
Gross necropsy
45.
At the time of sacrifice or death during the study, the adult animals should be examined macroscopically for any abnormalities or pathological changes. Special attention should be paid to the organs of the reproductive system. The number of implantation sites should be recorded. Vaginal smears should be examined in the morning on the day of necropsy to determine the stage of the oestrous cycle and allow correlation with histopathology of ovaries.
46.
The testes and epididymides as well as prostate and seminal vesicles with coagulating glands as a whole, of all male adult animals should be trimmed of any adherent tissue, as appropriate, and their wet weight taken as soon as possible after dissection to avoid drying. In addition, optional organ weights could include levator ani plus bulbocavernosus muscle complex, Cowper’s glands and glans penis in males and paired ovaries (wet weight) and uterus (including cervix) in females; if included, these weights should be collected as soon as possible after dissection.
47.
Dead pups and pups killed at day 13 post-partum, or shortly thereafter, should, at least, be carefully examined externally for gross abnormalities. Particular attention should be paid to the external reproductive genitals which should be examined for signs of altered development. At day 13 the thyroid from 1 male and 1 female pup per litter should be preserved.
48.
The ovaries, testes, accessory sex organs (uterus and cervix, epididymides, prostate, seminal vesicles plus coagulating glands), thyroid and all organs showing macroscopic lesions of all adult animals should be preserved. Formalin fixation is not recommended for routine examination of testes and epididymides. An acceptable method is the use of Bouin's fixative or modified Davidson's fixative for these tissues (11). The tunica albuginea may be gently and shallowly punctured at both poles of the organ with a needle to permit rapid penetration of the fixative.
Histopathology
49.
Detailed histological examination should be performed on the ovaries, testes and epididymides (with special emphasis on stages of spermatogenesis and histopathology of interstitial testicular cell structure) of the animals of the highest dose group and the control group. The other preserved organs including thyroid from pups and adult animals may be examined when necessary. The thyroid weight could be determined after fixation. Trimming should also be done very carefully and only after fixation to avoid tissue damage. Tissue damage could compromise histopathology analysis. Examinations should be extended to the animals of other dosage groups when changes are seen in the highest dose group. The Guidance on histopathology (11) details extra information on dissection, fixation, sectioning and histopathology of endocrine tissues.
DATA AND REPORTING
Data
50.
Individual animal data should be provided. Additionally, all data should be summarised in tabular form, showing for each test group the number of animals at the start of the test, the number of animals found dead during the test or killed for humane reasons, the time of any death or humane kill, the number of fertile animals, the number of pregnant females, the number of animals showing signs of toxicity, a description of the signs of toxicity observed, including time of onset, duration, and severity of any toxic effects, the types of histopathological changes, and all relevant litter data. A tabular summary report format that has proven to be very useful for the evaluation of reproductive/developmental effect is given in Appendix 3.
51.
Due to the limited dimensions of the study, statistical analyses in the form of tests for "significance" are of limited value for many endpoints, especially reproductive endpoints. If statistical analyses are used then the method chosen should be appropriate for the distribution of the variable examined, and be selected prior to the start of the study. Statistical analysis of AGD and nipple retention should be based on individual pup data, taking litter effects into account. Where appropriate, the litter is the unit of analysis. Statistical analysis of pup body weight should be based on individual pup data, taking litter size into account. Because of the small group size, the use of historical control data (e.g. for litter size), where available, may also be useful as an aid to the interpretation of the study.
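One simple way of treating the litter as the unit of analysis is to reduce individual pup values to one mean per litter before any group comparison. The sketch below shows only that reduction; the function name and data layout are hypothetical, and other approaches to litter effects (for example mixed models) are equally possible.

```python
from collections import defaultdict
from statistics import mean


def litter_means(pup_records):
    """Collapse (litter_id, value) pairs to one mean value per litter."""
    by_litter = defaultdict(list)
    for litter_id, value in pup_records:
        by_litter[litter_id].append(value)
    return {litter_id: mean(values) for litter_id, values in by_litter.items()}


records = [("A", 6.0), ("A", 8.0), ("B", 7.0)]
print(litter_means(records))  # {'A': 7.0, 'B': 7.0}
```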
Evaluation of results
52.
The findings of this toxicity study should be evaluated in terms of the observed effects, necropsy and microscopic findings. The evaluation will include the relationship between the dose of the test chemical and the presence or absence, incidence and severity of abnormalities, including gross lesions, identified target organs, infertility, clinical abnormalities, affected reproductive and litter performance, body weight changes, effects on mortality and any other toxic effects.
53.
Because of the short period of treatment of the male, the histopathology of the testes and epididymides should be considered along with the fertility data, when assessing male reproductive effects. The use of historical control data on reproduction/development (e.g., for litter size, AGD, nipple retention, serum T4 levels), where available, may also be useful as an aid to the interpretation of the study.
54.
For quality control it is proposed that historical control data are collected and that for numerical data coefficients of variation are calculated, especially for the parameters linked with endocrine disrupter detection. These data can be used for comparison purposes when actual studies are evaluated.
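The coefficient of variation proposed in paragraph 54 is the standard deviation expressed as a percentage of the mean. The sketch below uses the sample standard deviation, which is an assumption; the function name and the example values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent, using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical control-group means from three historical studies.
print(round(coefficient_of_variation([8.0, 10.0, 12.0]), 1))  # 20.0
```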
Test report
55.
The test report should include the following information:
Test chemical:
—
source, lot number, limit date for use, if available
—
stability of the test chemical, if known.
Mono-constituent substance:
—
physical appearance, water solubility, and additional relevant physicochemical properties;
—
chemical identification, such as IUPAC or CAS name, CAS number, SMILES or InChI code, structural formula, purity, chemical identity of impurities as appropriate and practically feasible, etc.
Multi-constituent substance, UVCBs and mixtures:
—
characterised as far as possible by chemical identity (see above), quantitative occurrence and relevant physicochemical properties of the constituents.
Vehicle (if appropriate):
—
justification for choice of vehicle if other than water.
Test animals:
—
species/strain used;
—
number, age and sex of animals;
—
source, housing conditions, diet, etc.;
—
individual weights of animals at the start of the test;
—
justification for species if not rat.
Test conditions:
—
rationale for dose level selection;
—
details of test chemical formulation/diet preparation, achieved concentrations, stability and homogeneity of the preparation;
—
details of the administration of the test chemical;
—
conversion from diet/drinking water test chemical concentration (ppm) to the actual dose (mg/kg body weight/day), if applicable;
—
details of food and water quality;
—
detailed description of the randomisation procedure to select pups for culling, if culled.
Results:
—
body weight/body weight changes;
—
food consumption, and water consumption if available;
—
toxic response data by sex and dose, including fertility, gestation, and any other signs of toxicity;
—
gestation length;
—
toxic or other effects on reproduction, offspring, post-natal growth, etc.;
—
nature, severity and duration of clinical observations (whether reversible or not);
—
number of adult females with normal or abnormal oestrous cycle and cycle duration;
—
number of live births and post-implantation loss;
—
pup body weight data
—
AGD of all pups (and body weight on day of AGD measurement)
—
nipple retention in male pups,
—
thyroid hormone levels, day 13 pups and adult males (and dams and day 4 pups if measured)
—
number of pups with grossly visible abnormalities, gross evaluation of external genitalia, number of runts;
—
time of death during the study or whether animals survived to termination;
—
number of implantations, litter size and litter weights at the time of recording;
—
body weight at sacrifice and organ weight data for the parental animals;
—
necropsy findings;
—
detailed description of histopathological findings;
—
absorption data (if available);
—
statistical treatment of results, where appropriate.
Discussion of results.
Conclusions.
Interpretation of results
56.
The study will provide evaluations of reproduction/developmental toxicity associated with administration of repeated doses (see paragraphs 5 and 6). It could provide an indication of the need to conduct further investigations and provides guidance in the design of subsequent studies. OECD Guidance Document 43 should be consulted for aid in the interpretation of reproduction and developmental results (12). OECD Guidance Document No 106 on Histologic Evaluation of Endocrine and Reproductive Tests in Rodents (11) provides information on the preparation and evaluation of (endocrine) organs and vaginal smears that may be helpful for this TG.
LITERATURE
(1)
OECD (1990). Room Document No 1 for the 14th Joint Meeting of the Chemicals Group and Management Committee. Available upon request at Organisation for Economic Cooperation and Development, Paris.
(2)
OECD (1992). Chairman's Report of the ad hoc Expert Meeting on Reproductive Toxicity Screening Methods, Tokyo, 27th-29th October, 1992. Available upon request at Organisation for Economic Cooperation and Development, Paris.
(3)
OECD (1998). Report of the First Meeting of the OECD Endocrine Disrupter Testing and Assessment (EDTA) Task Force, 10th-11th March 1998. Available upon request at Organisation for Economic Cooperation and Development, Paris.
(4)
OECD (2015). Feasibility Study for Minor Enhancements of TG 421/422 with ED Relevant Endpoints. Environment, Health and Safety Publications, Series on Testing and Assessment (No 217), Organisation for Economic Cooperation and Development, Paris.
(5)
OECD (2000). Guidance Document on the Recognition, Assessment, and Use of Clinical Signs as Humane Endpoints for Experimental Animals Used in Safety Evaluations. Series on Testing and Assessment (No 19), Organisation for Economic Cooperation and Development, Paris.
(6)
OECD (2011). Guidance Document on Standardised Test Guidelines for Evaluating Chemicals for Endocrine Disruption. Environment, Health and Safety Publications, Series on Testing and Assessment (No 150), Organisation for Economic Cooperation and Development, Paris.
(7)
Goldman, J.M., Murr A.S., Buckalew A.R., Ferrell J.M. and Cooper R.L. (2007). The Rodent Estrous Cycle: Characterization of Vaginal Cytology and its Utility in Toxicological Studies, Birth Defects Research, Part B, 80 (2), 84-97.
(8)
Sadleir R.M.F.S. (1979). Cycles and Seasons, in Austin C.R. and Short R.V. (eds.), Reproduction in Mammals: I. Germ Cells and Fertilization, Cambridge, New York.
(9)
Gallavan R.H. Jr, Holson J.F., Stump D.G., Knapp J.F. and Reynolds V.L. (1999). Interpreting the Toxicologic Significance of Alterations in Anogenital Distance: Potential for Confounding Effects of Progeny Body Weights, Reproductive Toxicology, 13: 383-390.
(10)
OECD (2013). Guidance Document in Support of the Test Guideline on the Extended One Generation Reproductive Toxicity Study. Environment, Health and Safety Publications, Series on Testing and Assessment (No 151), Organisation for Economic Cooperation and Development, Paris.
(11)
OECD (2009). Guidance Document for Histologic Evaluation of Endocrine and Reproductive Tests in Rodents. Environment, Health and Safety Publications, Series on Testing and Assessment (No 106), Organisation for Economic Cooperation and Development, Paris.
(12)
OECD (2008). Guidance Document on Mammalian Reproductive Toxicity Testing and Assessment. Environment, Health and Safety Publications, Series on Testing and Assessment (No 43), Organisation for Economic Cooperation and Development, Paris.
Appendix 1
DEFINITIONS (SEE ALSO OECD GD 150 (6))
Androgenicity is the capability of a chemical to act like a natural androgenic hormone (e.g. testosterone) in a mammalian organism.
Antiandrogenicity is the capability of a chemical to suppress the action of a natural androgenic hormone (e.g. testosterone) in a mammalian organism.
Antioestrogenicity is the capability of a chemical to suppress the action of a natural oestrogenic hormone (e.g. oestradiol 17β) in a mammalian organism.
Antithyroid activity is the capability of a chemical to suppress the action of a natural thyroid hormone (e.g. T3) in a mammalian organism.
Chemical is a substance or a mixture.
Developmental toxicity: the manifestation of reproductive toxicity, representing pre-, peri- or post-natal, structural, or functional disorders in the progeny.
Dosage is a general term comprising dose, its frequency and the duration of dosing.
Dose is the amount of test chemical administered. The dose is expressed as weight of test chemical per unit body weight of test animal per day (e.g. mg/kg body weight/day), or as a constant dietary concentration.
Evident toxicity is a general term describing clear signs of toxicity following administration of test chemical. These should be sufficient for hazard assessment and should be such that an increase in the dose administered can be expected to result in the development of severe toxic signs and probable mortality.
Impairment of fertility represents disorders of male or female reproductive functions or capacity.
Maternal toxicity: adverse effects on gravid females, occurring either specifically (direct effect) or not specifically (indirect effect).
NOAEL is the abbreviation for no-observed-adverse-effect level. This is the highest dose level at which no adverse treatment-related findings are observed.
Oestrogenicity is the capability of a chemical to act like a natural oestrogenic hormone (e.g. oestradiol 17β) in a mammalian organism.
Reproduction toxicity represents harmful effects on the progeny and/or an impairment of male and female reproductive functions or capacity.
Test chemical is any substance or mixture tested using this test method.
Thyroid activity is the capability of a chemical to act like a natural thyroid hormone (e.g. T3) in a mammalian organism.
Validation is a scientific process designed to characterise the operational requirements and limitations of a test method and to demonstrate its reliability and relevance for a particular purpose.
Appendix 2
DIAGRAM OF THE EXPERIMENTAL SCHEDULE INDICATING THE MAXIMUM STUDY DURATION, BASED ON A FULL 14-DAY MATING PERIOD
Appendix 3
TABULAR SUMMARY REPORT OF EFFECTS ON REPRODUCTION/DEVELOPMENT
OBSERVATIONS | VALUES
Dosage (units) | 0 (control) | … | … | … | …
Pairs started (N)
Oestrus cycle (at least mean length and frequency of irregular cycles)
Pup weight at the time of AGD measurement (mean males, mean females)
Pup AGD on the same postnatal day, birth – day 4 (mean males, mean females, note PND)
Pup weight at day 4 (mean)
Male pup nipple retention at day 13 (mean)
Pup weight at day 13 (mean)
ABNORMAL PUPS
Dams with 0
Dams with 1
Dams with 2
LOSS OF OFFSPRING
Pre-natal/post-implantations (implantations minus live births)
Females with 0
Females with 1
Females with 2
Females with 3
Post-natal (live births minus alive at post-natal day 13)
Females with 0
Females with 1
Females with 2
Females with 3
B.64 COMBINED REPEATED DOSE TOXICITY STUDY WITH THE REPRODUCTION/DEVELOPMENTAL TOXICITY SCREENING TEST
INTRODUCTION
1.
This test method is equivalent to OECD test guideline (TG) 422 (2016). OECD guidelines for the Testing of Chemicals are periodically reviewed in the light of scientific progress. The original screening test guideline 422 was adopted in 1996, based on a protocol for a "Combined Repeat Dose and Reproductive/Developmental Screening Test" discussed in two expert meetings, in London in 1990 (1) and in Tokyo in 1992 (2).
2.
This test method combines a reproduction/developmental toxicity screening part which is based on experience gained in Member countries from using the original method on existing high production volume chemicals and in exploratory tests with positive control substances (3) (4), and a repeated dose toxicity part, in concordance with OECD test guideline 407 (Repeated Dose 28-Day Oral Toxicity Study in Rodents, corresponding to Chapter B.7 of this Annex).
3.
This test method has been updated with endocrine disruptor relevant endpoints, as a follow up to the high-priority activity initiated at OECD in 1998 to revise existing test guidelines and to develop new test guidelines for the screening and testing of potential endocrine disruptors (5). In this context TG 407 (corresponding to Chapter B.7 of this Annex) was enhanced in 2008 by parameters suitable to detect endocrine activity of test chemicals. The objective in updating TG 422 was to include some endocrine disruptor relevant endpoints in screening TGs where the exposure periods cover some of the sensitive periods during development (pre- or early postnatal periods).
4.
The selected additional endocrine disrupter relevant endpoints, also part of TG 443 (Extended One Generation Reproductive Toxicity Study, corresponding to Chapter B.56 of this Annex), were included in TG 422 based on a feasibility study addressing scientific and technical questions related to their inclusion, as well as possible adaptations of the test design needed for their inclusion (6).
5.
This test method is designed to generate limited information concerning the effects of a test chemical on male and female reproductive performance such as gonadal function, mating behaviour, conception, development of the conceptus and parturition. It is not an alternative to, nor does it replace, the existing test methods B.31, B.34, B.35 or B.56.
INITIAL CONSIDERATIONS
6.
In the assessment and evaluation of the toxic characteristics of a test chemical the determination of oral toxicity using repeated doses may be carried out after the initial information on toxicity has been obtained by acute testing. This study provides information on the possible health hazards likely to arise from repeated exposure over a relatively limited period of time. The method comprises the basic repeated dose toxicity study that may be used for chemicals on which a 90-day study is not warranted (e.g. when the production volume does not exceed certain limits) or as a preliminary study to a long-term study. In conducting the study, the guiding principles and considerations outlined in the OECD guidance document no 19 on the recognition, assessment, and use of clinical signs as humane endpoints for experimental animals used in safety evaluations (7) should be followed.
7.
It further comprises a reproduction/developmental toxicity screening test and, therefore, can also be used to provide initial information on possible effects on male and female reproductive performance such as gonadal function, mating behaviour, conception, development of the conceptus and parturition, either at an early stage of assessing the toxicological properties of test chemicals, or on test chemicals of concern. This test method does not provide complete information on all aspects of reproduction and development. In particular, it offers only limited means of detecting postnatal manifestations of prenatal exposure, or effects that may be induced during postnatal exposure. Due (amongst other reasons) to the selectivity of the end points, and the short duration of the study, this method will not provide evidence for definite claims of no reproduction/developmental effects. Moreover, in the absence of data from other reproduction/developmental toxicity tests, positive results are useful for initial hazard assessment and contribute to decisions with respect to the necessity and timing of additional testing.
8.
The results obtained for the endocrine related parameters should be seen in the context of the “OECD Conceptual Framework for Testing and Assessment of Endocrine Disrupting Chemicals” (8). In this Conceptual Framework, the enhanced OECD TG 422 is contained in level 4 as an in vivo assay providing data on adverse effects on endocrine relevant endpoints. An endocrine signal might not, however, be considered sufficient evidence on its own that the test chemical is an endocrine disruptor.
9.
The test method also places emphasis on neurological effects as a specific endpoint, and the need for careful clinical observations of the animals, so as to obtain as much information as possible, is stressed. The method should identify chemicals with neurotoxic potential, and which may warrant further in-depth investigation of this aspect. In addition, the method may also give a basic indication of immunological effects.
10.
In the absence of data from other systemic toxicity, reproduction/developmental toxicity, neurotoxicity and/or immunotoxicity studies, positive results are useful for initial hazard assessment and contribute to decisions with respect to the necessity and timing of additional testing. The test may be particularly useful as part of the OECD Screening Information Data Set (SIDS) for the assessment of existing chemicals for which little or no toxicological information is available and can serve as an alternative to conducting two separate tests for repeated dose toxicity (OECD TG 407, corresponding to Chapter B.7 of this Annex) and reproduction/developmental toxicity (OECD TG 421, corresponding to Chapter B.63 of this Annex), respectively. It can also be used as a dose range finding study for more extensive reproduction/developmental studies, or when otherwise considered relevant.
11.
Generally, it is assumed that there are differences in sensitivity between pregnant and non-pregnant animals. Consequently, it may be more complicated to determine dose levels in this combined test that are adequate to evaluate both general systemic toxicity and specific reproduction/developmental toxicity, rather than when the individual tests are conducted separately. Moreover, interpretation of the test results with respect to general systemic toxicity may be more difficult than when conducting a separate repeated-dose study, especially when serum and histopathology parameters are not evaluated at the same time in the study. Because of these technical complexities, considerable experience in toxicity testing is required for the performance of this combined screening test. On the other hand, apart from the smaller number of animals involved, the combined test may offer a better means of discriminating direct effects on reproduction/development from those that are secondary to other (systemic) effects.
12.
In this test, the dosing period is longer than in a conventional 28-day repeated dose study. However, it uses fewer animals of each sex per group when compared with the situation where a conventional 28-day repeated dose study is conducted in addition to a Reproduction/Developmental Toxicity Screening Test.
13.
This test method assumes oral administration of the test chemical. Modifications may be required if other routes of exposure are used.
14.
Before use of the test method on a mixture for generating data for an intended regulatory purpose, it should be considered whether, and if so why, it may provide adequate results for that purpose. Such considerations are not needed, when there is a regulatory requirement for testing of the mixture.
15.
Definitions used are given in Appendix 1.
PRINCIPLE OF THE TEST
16.
The test chemical is administered in graduated doses to several groups of males and females. Males should be dosed for a minimum of four weeks, up to and including the day before scheduled kill (this includes a minimum of two weeks prior to mating, during the mating period and, approximately, two weeks post mating). In view of the limited pre-mating dosing period in males, fertility may not be a particularly sensitive indicator of testicular toxicity. Therefore, a detailed histological examination of the testes is essential. The combination of a pre-mating dosing period of two weeks and subsequent mating/fertility observations with an overall dosing period of at least four weeks, followed by detailed histopathology of the male gonads, is considered sufficient to enable detection of the majority of effects on male fertility and spermatogenesis.
17.
Females should be dosed throughout the study. This includes two weeks prior to mating (with the objective of covering at least two complete oestrous cycles), the variable time to conception, the duration of pregnancy and at least thirteen days after delivery, up to and including the day before scheduled kill.
18.
Duration of study, following acclimatisation and pre-dosing oestrous cycle evaluation, is dependent on the female performance and is approximately 63 days [at least 14 days pre-mating, (up to) 14 days mating, 22 days gestation, 13 days lactation].
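The figure of approximately 63 days is the sum of the four phases listed in brackets. The arithmetic can be sketched as follows; this is an illustration only, and the dictionary and function names are hypothetical and not part of the test method.

```python
# Phase lengths in days as listed in paragraph 18 (names are illustrative).
PHASE_DAYS = {
    "pre_mating": 14,  # at least 14 days
    "mating": 14,      # up to 14 days
    "gestation": 22,
    "lactation": 13,
}


def approximate_study_duration(phases=PHASE_DAYS):
    """Return the sum of the phase lengths, in days."""
    return sum(phases.values())


print(approximate_study_duration())  # 63
```

Because the mating phase may be shorter than 14 days, the actual duration for an individual female can be less than this maximum.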
19.
During the period of administration, the animals are observed closely each day for signs of toxicity. Animals which die or are killed during the test are necropsied and, at the conclusion of the test, surviving animals are killed and necropsied.
DESCRIPTION OF THE METHOD
Selection of animal species
20.
This test method is designed for use with the rat. If the parameters specified within this TG 422 are investigated in another rodent species a detailed justification should be given. In the international validation program for the detection of endocrine disrupters on TG 407, the rat was the only species used. Strains with low fecundity or well-known high incidence of developmental defects should not be used. Healthy virgin animals, not subjected to previous experimental procedures, should be used. The test animals should be characterised as to species, strain, sex, weight and age. At the commencement of the study the weight variation of animals used should be minimal and not exceed ± 20 % of the mean weight of each sex. Where the study is conducted as a preliminary study to a long-term or a full-generation study, it is preferable that animals from the same strain and source are used in both studies.
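The ± 20 % weight criterion can be checked per sex with a few lines of code. The following is a minimal sketch, assuming body weights in grams recorded at the commencement of the study; the function name and the example weights are hypothetical.

```python
from statistics import mean


def animals_outside_weight_range(weights_g, tolerance=0.20):
    """Return the indices of animals whose weight deviates from the
    mean of the supplied list (one sex) by more than the tolerance."""
    group_mean = mean(weights_g)
    limit = tolerance * group_mean
    return [i for i, w in enumerate(weights_g) if abs(w - group_mean) > limit]


# Hypothetical male weights in grams; the last animal is too heavy.
print(animals_outside_weight_range([200, 200, 200, 320]))  # [3]
```

The check is applied separately to males and females, since the criterion refers to the mean weight of each sex.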
Housing and feeding
21.
All procedures should conform to local standards of laboratory animal care. The temperature in the experimental animal room should be 22 °C (± 3 °C). The relative humidity should be at least 30 % and preferably not exceed 70 % other than during room cleaning. Lighting should be artificial, the photoperiod being 12 hours light, 12 hours dark. For feeding, conventional laboratory diets may be used with an unlimited supply of drinking water. The choice of diet may be influenced by the need to ensure a suitable admixture of a test chemical when administered by this method.
22.
Animals should be group housed in small groups of the same sex; animals may be housed individually if scientifically justified. For group caging, no more than five animals should be housed per cage. Mating procedures should be carried out in cages suitable for the purpose. Pregnant females should be caged individually and provided with nesting materials. Lactating females will be caged individually with their offspring.
23.
The feed should be regularly analysed for contaminants. A sample of the diet should be retained until finalisation of the report.
Preparation of the animals
24.
Healthy young adult animals are randomised and assigned to the treatment groups and cages. Cages should be arranged in such a way that possible effects due to cage placements are minimised. The animals are uniquely identified and kept in their cages for at least five days prior to the start of the study to allow for acclimatisation to the laboratory conditions.
Preparation of doses
25.
It is recommended that the test chemical be administered orally unless other routes of administration are considered more appropriate. When the oral route is selected, the test chemical is usually administered by gavage; however, alternatively, test chemicals may also be administered via the diet or drinking water.
26.
Where necessary, the test chemical is dissolved or suspended in a suitable vehicle. It is recommended that, wherever possible, the use of an aqueous solution/suspension be considered first, followed by consideration of a solution/suspension in oil (e.g. corn oil) and then by possible solution in other vehicles. For non-aqueous vehicles the toxic characteristics of the vehicle should be known. The stability and homogeneity of the test chemical in the vehicle should be determined.
PROCEDURE
Number and sex of animals
27.
It is recommended that each group be started with at least 10 males and 12-13 females. Females will be evaluated pre-exposure for oestrous cyclicity and animals that fail to exhibit typical 4-5 day cycles will not be included in the study; therefore, extra females are recommended in order to yield 10 females per group. Except in the case of marked toxic effects, it is expected that this will provide at least 8 pregnant females per group which normally is the minimum acceptable number of pregnant females per group. The objective is to produce enough pregnancies and offspring to assure a meaningful evaluation of the potential of the test chemical to affect fertility, pregnancy, maternal and suckling behaviour, and growth and development of the F1 offspring from conception to day 13 post-partum. If interim kills are planned, the number should be increased by the number of animals scheduled to be killed before the completion of the study. Consideration should be given to an additional satellite group of five animals per sex in the control and the top dose group for observation of reversibility, persistence or delayed occurrence of systemic toxic effects, for at least 14 days post treatment. Animals of the satellite groups will not be mated and, consequently, are not used for the assessment of reproduction/developmental toxicity.
Dosage
28.
Generally, at least three test groups and a control group should be used. If there are no suitable general toxicity data available, a range finding study (animals of the same strain and source) may be performed to aid the determination of the doses to be used. Except for treatment with the test chemical, animals in the control group should be handled in an identical manner to the test group subjects. If a vehicle is used in administering the test chemical, the control group should receive the vehicle in the highest volume used.
29.
Dose levels should be selected taking into account any existing toxicity and (toxico-)kinetic data available. It should also be taken into account that there may be differences in sensitivity between pregnant and non-pregnant animals. The highest dose level should be chosen with the aim of inducing toxic effects but not death or obvious suffering. Thereafter, a descending sequence of dose levels should be selected with a view to demonstrating any dosage related response and no adverse effects at the lowest dose level. Two- to four-fold intervals are frequently optimum and addition of a fourth test group is often preferable to using very large intervals (e.g. more than a factor of 10) between dosages.
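The spacing guidance can be illustrated with a short sketch that builds a descending dose sequence from a chosen top dose and a constant interval, and reports the interval between adjacent doses. The function names and the numerical values are hypothetical and serve as an example only; they are not recommended dose levels.

```python
def descending_doses(top_dose, factor, n_groups=3):
    """Return n_groups dose levels, highest first, each lower than the
    previous one by a constant factor."""
    return [top_dose / factor ** i for i in range(n_groups)]


def spacing_factors(doses):
    """Return the ratio between each dose and the next lower dose."""
    ordered = sorted(doses, reverse=True)
    return [high / low for high, low in zip(ordered, ordered[1:])]


# Hypothetical top dose of 1000 mg/kg bw/day with four-fold intervals.
print(descending_doses(1000, 4))        # [1000.0, 250.0, 62.5]
print(spacing_factors([1000, 100, 10])) # [10.0, 10.0]
```

A result such as the second one, with ten-fold intervals, is the situation in which adding a fourth test group would often be preferable.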
30.
In the presence of observed general toxicity (e.g. reduced body weight, liver, heart, lung or kidney effects, etc.) or other changes that may not be toxic responses (e.g. reduced food intake, liver enlargement), observed effects on endocrine sensitive endpoints should be interpreted with caution.
Limit test
31.
If an oral study at one dose level of at least 1 000 mg/kg body weight/day or, for dietary administration, an equivalent percentage in the diet, or drinking water (based upon body weight determinations), using the procedures described for this study, produces no observable toxic effects and if toxicity would not be expected based upon data from structurally related substances, then a full study using several dose levels may not be considered necessary. The limit test applies except when human exposure indicates the need for a higher dose level to be used. For other types of administration, such as inhalation or dermal application, the physicochemical properties of the test chemicals often may dictate the maximum attainable exposure.
Administration of doses
32.
The animals are dosed with the test chemical daily for 7 days a week. When the test chemical is administered by gavage, this should be done in a single dose to the animals using a stomach tube or a suitable intubation cannula. The maximum volume of liquid that can be administered at one time depends on the size of the test animal. The volume should not exceed 1 ml/100 g body weight, except in the case of aqueous solutions where 2 ml/100 g body weight may be used. Except for irritating or corrosive test chemicals which will normally reveal exacerbated effects with higher concentrations, variability in test volume should be minimised by adjusting the concentration to ensure a constant volume at all dose levels.
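The volume limits and the principle of a constant dosing volume can be expressed as simple calculations. The sketch below is illustrative only; the function names and example values are hypothetical, and 1 ml/100 g body weight is treated as equivalent to 10 ml/kg body weight.

```python
def max_gavage_volume_ml(body_weight_g, aqueous=False):
    """Return the maximum single gavage volume in ml:
    1 ml/100 g body weight, or 2 ml/100 g for aqueous solutions."""
    limit_ml_per_100_g = 2.0 if aqueous else 1.0
    return limit_ml_per_100_g * body_weight_g / 100.0


def concentration_mg_per_ml(dose_mg_per_kg, volume_ml_per_kg):
    """Return the concentration that delivers the dose at a fixed
    dosing volume, so that volume stays constant across dose levels."""
    return dose_mg_per_kg / volume_ml_per_kg


print(max_gavage_volume_ml(250))                # 2.5
print(max_gavage_volume_ml(250, aqueous=True))  # 5.0
print(concentration_mg_per_ml(100, 10))         # 10.0
```

Holding the volume per kilogram fixed and varying only the concentration is one way of minimising variability in test volume between dose groups.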
33.
For test chemicals administered via the diet or drinking water, it is important to ensure that the quantities of the test chemical involved do not interfere with normal nutrition or water balance. When the test chemical is administered in the diet either a constant dietary concentration (ppm) or a constant dose level in terms of the animals’ body weight may be used; the alternative used should be specified. For a test chemical administered by gavage, the dose should be given at similar times each day, and adjusted at least weekly to maintain a constant dose level in terms of animal body weight. Where the combined study is used as a preliminary to a long term or a full reproduction toxicity study, a similar diet should be used in both studies.
Experimental schedule
34.
Dosing of both sexes should begin 2 weeks prior to mating, after they have been acclimatised for at least five days and females have been screened for normal oestrous cycles (in a 2-week pre-treatment period). The study should be scheduled in such a way that oestrous cycle evaluation begins soon after the animals have attained full sexual maturity. This may vary slightly for different strains of rats in different laboratories, e.g. Sprague Dawley rats 10 weeks of age, Wistar rats about 12 weeks of age. Dams with offspring should be killed on day 13 post-partum, or shortly thereafter. In order to allow for overnight fasting of dams prior to blood collection (if this option is preferred), dams and their offspring need not necessarily be killed on the same day. The day of birth (viz. when parturition is complete) is defined as day 0 post-partum. Females showing no evidence of copulation are killed 24-26 days after the last day of the mating period. Dosing is continued in both sexes during the mating period. Males should further be dosed after the mating period at least until the minimum total dosing period of 28 days has been completed. They are then killed or, alternatively, are retained and continue to be dosed for the possible conduct of a second mating if considered appropriate.
35.
Daily dosing of the parental females should continue throughout pregnancy and at least up to, and including, day 13 post-partum or the day before sacrifice. For studies where the test chemical is administered by inhalation or by the dermal route, dosing should be continued at least up to, and including, day 19 of gestation, and dosing should be re-initiated as soon as possible and not later than postnatal day (PND) 4.
36.
Animals in a satellite group scheduled for follow-up observations, if included, are not mated. They should be kept at least for a further 14 days after the first scheduled kill of dams, without treatment to detect delayed occurrence, or persistence of, or recovery from toxic effects.
37.
A diagram of the experimental schedule is given in Appendix 2.
Oestrous cycles
38.
Oestrous cycles should be monitored before treatment starts in order to select females with regular cyclicity for the study (see paragraph 27). Vaginal smears should also be monitored daily from the beginning of the treatment period until evidence of mating. If there is concern about acute stress effects that could alter oestrous cycles with the initiation of dosing, laboratories can expose test animals for 2 weeks, then collect vaginal smears daily to monitor the oestrous cycle for a minimum of two weeks during the pre-mating period with continued monitoring into the mating period until there is evidence of mating. When obtaining vaginal/cervical cells, care should be taken to avoid disturbance of mucosa, which could induce pseudopregnancy (8) (9).
Mating procedure
39.
Normally, 1:1 (one male to one female) matings should be used in this study. Exceptions can arise in the case of occasional deaths of males. The female should be placed with the same male until evidence of copulation is observed or two weeks have elapsed. Each morning the females should be examined for the presence of sperm or a vaginal plug. Day 0 of pregnancy is defined as the day on which mating evidence is confirmed (a vaginal plug or sperm is found). In case pairing was unsuccessful, re-mating of females with proven males of the same group could be considered.
Litter size
40.
On day 4 after birth, the size of each litter may be adjusted by eliminating extra pups by random selection to yield, as nearly as possible, four or five pups per sex per litter depending on the normal litter size in the strain of rats used. Blood samples should be collected from two of the surplus pups, pooled, and used for determination of serum T4 levels. Selective elimination of pups, e.g. based upon body weight or anogenital distance (AGD), is not appropriate. Whenever the number of male or female pups prevents having four or five of each sex per litter, partial adjustment (for example, six males and four females) is acceptable. No pups will be eliminated when litter size would drop below the culling target (8 or 10 pups/litter). If there is only one pup available above the culling target, only one pup will be eliminated and used for blood collection for possible serum T4 assessments.
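The counting rules of this paragraph can be sketched as follows. This is one possible reading, expressed with hypothetical names; it works only with pup counts and does not model the random selection of the individual pups to be eliminated.

```python
def litter_adjustment(n_males, n_females, per_sex_target=4):
    """Return the pup counts to keep and eliminate on day 4, assuming a
    culling target of 2 * per_sex_target pups per litter."""
    target_total = 2 * per_sex_target
    keep_m = min(n_males, per_sex_target)
    keep_f = min(n_females, per_sex_target)
    # Partial adjustment: if one sex is short, top up with the other sex.
    shortfall = target_total - (keep_m + keep_f)
    add_m = min(n_males - keep_m, shortfall)
    keep_m += add_m
    shortfall -= add_m
    keep_f += min(n_females - keep_f, shortfall)
    eliminated = (n_males + n_females) - (keep_m + keep_f)
    return {
        "keep_males": keep_m,
        "keep_females": keep_f,
        "eliminated": eliminated,
        # Blood is taken from two surplus pups, or one if only one is surplus.
        "blood_samples": min(2, eliminated),
    }


# Eight males and four females with a target of five per sex.
print(litter_adjustment(8, 4, per_sex_target=5))
```

With eight males and four females and a target of five per sex, the sketch keeps six males and four females, which corresponds to the partial adjustment example given in the paragraph.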
41.
If litter size is not adjusted, two pups per litter are sacrificed on day 4 after birth and blood samples are taken for measurement of serum thyroid hormone concentrations. If possible, the two pups per litter should be female pups to reserve male pups for nipple retention evaluations, except in the event that removing these pups leaves no remaining females for assessment at termination. No pups will be eliminated when litter size would drop below 8 or 10 pups/litter (depending on the normal litter size in the strain of rats used). If there is only one pup available above the normal litter size, only one pup will be eliminated and used for blood collection for possible serum T4 assessments.
Observations
42.
General clinical observations should be made at least once a day, preferably at the same time(s) each day and considering the peak period of anticipated effects after dosing. The health condition of the animals should be recorded. At least twice daily all animals are observed for morbidity and mortality.
43.
Once before the first exposure (to allow for within-subject comparisons), and at least once a week thereafter, detailed clinical observations should be made in all parental animals. These observations should be made outside the home cage in a standard arena and preferably at the same time each day. They should be carefully recorded, preferably using scoring systems explicitly defined by the testing laboratory. Effort should be made to ensure that variations in the test conditions are minimal and that observations are preferably conducted by observers unaware of the treatment. Signs noted should include, but not be limited to, changes in skin, fur, eyes, mucous membranes, occurrence of secretions and excretions and autonomic activity (e.g. lacrimation, piloerection, pupil size, unusual respiratory pattern). Changes in gait, posture and response to handling as well as the presence of clonic or tonic movements, stereotypies (e.g. excessive grooming, repetitive circling), difficult or prolonged parturition or bizarre behaviour (e.g. self-mutilation, walking backwards) should also be recorded (10).
44.
At one time during the study, sensory reactivity to stimuli of different modalities (e.g. auditory, visual and proprioceptive stimuli) (8) (9) (11), assessment of grip strength (12) and motor activity assessment (13) should be conducted in five males and five females, randomly selected from each group. Further details of the procedures that could be followed are given in the respective references. However, procedures other than those referenced could also be used. In males, these functional observations should be made towards the end of their dosing period, shortly before scheduled kill but before blood sampling for haematology or clinical chemistry (see paragraphs 53-56, including footnote 1). Females should be in a physiologically similar state during these functional tests and should preferably be tested once during the last week of lactation (e.g. LD 6-13), shortly before scheduled kill. To the extent possible, the time for which dams and pups are separated should be minimised.
45.
Functional observations made once towards the end of the study may be omitted when the study is conducted as a preliminary study to a subsequent subchronic (90-day) or long-term study. In that case, the functional observations should be included in this follow-up study. On the other hand, the availability of data on functional observations from this repeated dose study may enhance the ability to select dose levels for a subsequent subchronic or long-term study.
46.
As an exception, functional observations may also be omitted for groups that otherwise reveal signs of toxicity to an extent that would significantly interfere with the functional test performance.
47.
The duration of gestation should be recorded and is calculated from day 0 of pregnancy. Each litter should be examined as soon as possible after delivery to establish the number and sex of pups, stillbirths, live births, runts (pups that are significantly smaller than corresponding control pups), and the presence of gross abnormalities.
48.
Live pups should be counted and sexed and litters weighed within 24 hours of parturition (day 0 or 1 post-partum) and at least on day 4 and day 13 post-partum. In addition to the observations on parent animals (see paragraphs 43 and 44), any abnormal behaviour of the offspring should be recorded.
49.
The AGD of each pup should be measured on the same postnatal day between PND 0 and PND 4. Pup body weight should be collected on the day the AGD is measured and the AGD should be normalised to a measure of pup size, preferably the cube root of body weight (14). The number of nipples/areolae in male pups should be counted on PND 12 or 13 as recommended in OECD GD 151 (15).
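Normalisation of the AGD to the cube root of body weight amounts to dividing the measured distance by the body weight raised to the power of one third. A minimal sketch follows, assuming AGD in millimetres and body weight in grams; the function name, units and example values are illustrative only.

```python
def normalised_agd(agd_mm, body_weight_g):
    """Return the AGD divided by the cube root of pup body weight."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return agd_mm / body_weight_g ** (1.0 / 3.0)


# Hypothetical pup: AGD of 4.0 mm at a body weight of 8.0 g.
print(round(normalised_agd(4.0, 8.0), 3))  # 2.0
```

The body weight used should be the one recorded on the day of the AGD measurement, as the paragraph requires.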
Body weight and food/water consumption
50.
Males and females should be weighed on the first day of dosing, at least weekly thereafter, and at termination. During pregnancy, females should be weighed on days 0, 7, 14 and 20 and within 24 hours of parturition (day 0 or 1 post-partum), and at least day 4 and day 13 post-partum. These observations should be reported individually for each adult animal.
51.
During pre-mating, pregnancy and lactation, food consumption should be measured at least weekly. The measurement of food consumption during mating is optional. Water consumption during these periods should also be measured, when the test chemical is administered by that medium.
Haematology
52.
Once during the study, the following haematological examinations should be made in five males and five females randomly selected from each group: haematocrit, haemoglobin concentrations, erythrocyte count, reticulocytes, total and differential leucocyte count, platelet count and a measure of blood clotting time/potential. Other determinations that should be carried out, if the test chemical or its putative metabolites have or are suspected to have oxidising properties include methaemoglobin concentration and Heinz bodies.
53.
Blood samples should be taken from a named site. Females should be in a physiologically similar state during sampling. In order to avoid practical difficulties related to the variability in the onset of gestation, blood collection in females may be done at the end of the pre-mating period as an alternative to sampling just prior to, or as part of, the procedure for euthanasia of the animals. Blood samples of males should preferably be taken just prior to, or as part of, the procedure for euthanasia of the animals. Alternatively, blood collection in males may also be done at the end of the pre-mating period when this time point was preferred for females.
54.
Blood samples should be stored under appropriate conditions.
Clinical biochemistry
55.
Clinical biochemistry determinations to investigate major toxic effects in tissues and, specifically, effects on kidney and liver, should be performed on blood samples obtained from the selected five males and five females of each group. Overnight fasting of the animals prior to blood sampling is recommended (22). Investigations of plasma or serum should include sodium, potassium, glucose, total cholesterol, urea, creatinine, total protein and albumin, at least two enzymes indicative of hepatocellular effects (such as alanine aminotransferase, aspartate aminotransferase and sorbitol dehydrogenase) and bile acids. Measurements of additional enzymes (of hepatic or other origin) and bilirubin may provide useful information under certain circumstances.
56.
Blood samples from a defined site are taken based on the following schedule:
—
from at least two pups per litter on day 4 after birth, if the number of pups allows (see paragraphs 40-41)
—
from all dams and at least two pups per litter at termination on day 13, and
—
from all adult males, at termination
All blood samples are stored under appropriate conditions. Blood samples from the day 13 pups and the adult males are assessed for serum levels of thyroid hormones (T4). Further assessment of T4 in blood samples from the dams and day 4 pups is done if relevant. As an option, other hormones may be measured if relevant. Pup blood can be pooled by litter for thyroid hormone analyses. Thyroid hormones (T4 and TSH) should preferably be measured as ‘total’.
57.
Optionally, the following urinalysis determinations could be performed in five randomly selected males of each group during the last week of the study using timed urine volume collection: appearance, volume, osmolality or specific gravity, pH, protein, glucose and blood/blood cells.
58.
In addition, studies to investigate serum markers of general tissue damage should be considered. Other determinations that should be carried out if the known properties of the test chemical may, or are suspected to, affect related metabolic profiles include calcium, phosphate, fasting triglycerides and fasting glucose, specific hormones, methaemoglobin and cholinesterase. These need to be identified on a case-by-case basis.
59.
The following factors may influence the variability and the absolute concentrations of the hormone determinations:
—
time of sacrifice because of diurnal variation of hormone concentrations
—
method of sacrifice to avoid undue stress to the animals that may affect hormone concentrations
—
test kits for hormone determinations that may differ by their standard curves.
60.
Plasma samples specifically intended for hormone determination should be obtained at a comparable time of the day. The numerical values obtained when analysing hormone concentrations differ with various commercial assay kits.
61.
If historical baseline data are inadequate, consideration should be given to determination of haematological and clinical biochemistry variables before dosing commences or preferably in a set of animals not included in the experimental groups. For females, the data have to be from lactating animals.
PATHOLOGY
Gross necropsy
62.
All adult animals in the study should be subjected to a full, detailed gross necropsy which includes careful examination of the external surface of the body, all orifices, and the cranial, thoracic and abdominal cavities and their contents. Special attention should be paid to the organs of the reproductive system. The number of implantation sites should be recorded. Vaginal smears should be examined on the day of necropsy to determine the stage of the oestrous cycle and allow correlation with histopathology of female reproductive organs.
63.
The testes and epididymides as well as prostate and seminal vesicles with coagulating glands as a whole of all male adult animals should be trimmed of any adherent tissue, as appropriate, and their wet weight taken as soon as possible after dissection to avoid drying. In addition, optional organ weights could include levator ani plus bulbocavernosus muscle complex, Cowper’s glands and glans penis in males and paired ovaries (wet weight) and uterus (including cervix) in females; if included, these weights should be collected as soon as possible after dissection. The ovaries, testes, epididymides, accessory sex organs, and all organs showing macroscopic lesions of all adult animals, should be preserved.
64.
From all adult males and females and one male and female day 13 pup from each litter thyroid glands should be preserved in the most appropriate fixation medium for the intended subsequent histopathological examination. The thyroid weight could be determined after fixation. Trimming should also be done very carefully and only after fixation to avoid tissue damage. Tissue damage could compromise histopathology analysis. Blood samples should be taken from a named site just prior to or as part of the procedure for euthanasia of the animals, and stored under appropriate conditions (see paragraph 56).
65.
In addition, for at least five adult males and females, randomly selected from each group (apart from those found moribund and/or euthanised prior to the termination of the study), the liver, kidneys, adrenals, thymus, spleen, brain and heart should be trimmed of any adherent tissue, as appropriate, and their wet weight taken as soon as possible after dissection to avoid drying. The following tissues should be preserved in the most appropriate fixation medium for both the type of tissue and the intended subsequent histopathological examination: all gross lesions, brain (representative regions including cerebrum, cerebellum and pons), spinal cord, eye, stomach, small and large intestines (including Peyer's patches), liver, kidneys, adrenals, spleen, heart, thymus, trachea and lungs (preserved by inflation with fixative and then immersion), gonads (testis and ovaries), accessory sex organs (uterus and cervix, epididymides, prostate, seminal vesicles plus coagulating glands), vagina, urinary bladder, lymph nodes (besides the most proximal draining node, another lymph node should be taken according to the laboratory’s experience (16)), peripheral nerve (sciatic or tibial) preferably in close proximity to the muscle, skeletal muscle and bone, with bone marrow (section or, alternatively, a fresh mounted bone marrow aspirate). It is recommended that testes be fixed by immersion in Bouin’s or modified Davidson’s fixative (16) (17) (18); formalin fixation is not recommended for these tissues. The tunica albuginea may be gently and shallowly punctured at both poles of the organ with a needle to permit rapid penetration of the fixative. The clinical and other findings may suggest the need to examine additional tissues. Also, any organs considered likely to be target organs based on the known properties of the test chemical should be preserved.
66.
The following tissues may give valuable indication of endocrine-related effects: gonads (ovaries and testes), accessory sex organs (uterus including cervix, epididymides, seminal vesicles with coagulating glands, dorsolateral and ventral prostate), vagina, pituitary, male mammary gland and adrenal gland. Changes in male mammary glands have not been sufficiently documented but this parameter may be very sensitive to substances with estrogenic action. Observation of organs/tissues that are not listed in paragraph 65 is optional.
67.
Dead pups and pups killed at day 13 post-partum, or shortly thereafter, should, at least, be carefully examined externally for gross abnormalities. Particular attention should be paid to the external reproductive genitals which should be examined for signs of altered development.
Histopathology
68.
Full histopathology should be carried out on the preserved organs and tissues of the selected animals in the control and high dose groups (with special emphasis on stages of spermatogenesis in the male gonads and histopathology of interstitial testicular cell structure). The thyroid gland from pups and from the remaining adult animals may be examined when necessary. These examinations should be extended to animals of other dosage groups, if treatment-related changes are observed in the high dose group. The Guidance on histopathology (10) details extra information on dissection, fixation, sectioning and histopathology of endocrine tissues.
69.
All gross lesions should be examined. To aid in the elucidation of NOAELs, target organs in other dose groups should be examined, particularly in groups claimed to show a NOAEL.
70.
When a satellite group is used, histopathology should be performed on tissues and organs identified as showing effects in the treated groups.
DATA AND REPORTING
Data
71.
Individual animal data should be provided. Additionally, all data should be summarised in tabular form, showing for each test group the number of animals at the start of the test, the number of animals found dead during the test or euthanised for humane reasons, the time of any death or euthanasia, the number of fertile animals, the number of pregnant females, the number of animals showing signs of toxicity, a description of the signs of toxicity observed, including time of onset, duration, and severity of any toxic effects, the types of histopathological changes, and all relevant litter data. A tabular summary report format, which has proven to be very useful for the evaluation of reproductive/developmental effects, is given in Appendix 3.
72.
When possible, numerical results should be evaluated by an appropriate and generally acceptable statistical method. Comparisons of the effect along a dose range should avoid the use of multiple t-tests. The statistical methods should be selected during the design of the study. Statistical analysis of AGD and nipple retention should be based on individual pup data, taking litter effects into account. Where appropriate, the litter is the unit of analysis. Statistical analysis of pup body weight should be based on individual pup data, taking litter size into account. Due to the limited dimensions of the study, statistical analyses in the form of tests for "significance" are of limited value for many endpoints, especially reproductive endpoints. Some of the most widely used methods, especially parametric tests for measures of central tendency, are inappropriate. If statistical analyses are used then the method chosen should be appropriate for the distribution of the variable examined and be selected prior to the start of the study.
Evaluation of results
73.
The findings of this toxicity study should be evaluated in terms of the observed effects, necropsy and microscopic findings. The evaluation will include the relationship between the dose of the test chemical and the presence or absence, incidence and severity of abnormalities, including gross lesions, identified target organs, infertility, clinical abnormalities, affected reproductive and litter performance, body weight changes, effects on mortality and any other toxic effects.
74.
Because of the short period of treatment of the male, the histopathology of the testes and epididymides should be considered along with the fertility data, when assessing male reproduction effects. The use of historic control data on reproduction/development (e.g. for litter size, AGD, nipple retention, serum T4 levels), where available, may also be useful as an aid to the interpretation of the study.
75.
For quality control it is proposed that historical control data are collected and that for numerical data coefficients of variation are calculated, especially for the parameters linked with endocrine disrupter detection. These data can be used for comparison purposes when actual studies are evaluated.
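The coefficient of variation mentioned above can be sketched as follows. This is an illustration only: it assumes the sample standard deviation (n - 1 denominator) and expresses the result as a percentage, neither of which is specified by this test method, and the values used are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the coefficient of variation (%) of historical control values.

    Assumption: sample standard deviation (n - 1) divided by the mean.
    """
    if len(values) < 2:
        raise ValueError("at least two values are needed")
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation undefined")
    return 100.0 * stdev(values) / m


# Hypothetical historical control means for one parameter
cv_percent = coefficient_of_variation([10.0, 12.0, 14.0])
```

A parameter with a high coefficient of variation in historical controls will need a larger treatment-related change before an effect can be distinguished from normal variability.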
Test report
76.
The test report should include the following information:
Test chemical:
—
source, lot number, limit date for use, if available
—
stability of the test chemical, if known.
Mono-constituent substance:
—
physical appearance, water solubility, and additional relevant physicochemical properties;
—
chemical identification, such as IUPAC or CAS name, CAS number, SMILES or InChI code, structural formula, purity, chemical identity of impurities as appropriate and practically feasible, etc.
Multi-constituent substance, UVCBs and mixtures:
—
characterised as far as possible by chemical identity (see above), quantitative occurrence and relevant physicochemical properties of the constituents.
Vehicle (if appropriate):
—
justification for choice of vehicle, if other than water.
Test animals:
—
species/strain used;
—
number, age and sex of animals;
—
source, housing conditions, diet, etc.;
—
individual weights of animals at the start of the test;
—
justification for species if not rat.
Test conditions:
—
rationale for dose level selection;
—
details of test chemical formulation/diet preparation, achieved concentration, stability and homogeneity of the preparation;
—
details of the administration of the test chemical;
—
conversion from diet/drinking water test chemical concentration (ppm) to the actual dose (mg/kg body weight/day), if applicable;
—
details of food and water quality;
—
detailed description of the randomisation procedure to select pups for culling, if culled.
Results:
—
body weight/body weight changes;
—
food consumption and water consumption, if applicable;
—
toxic response data by sex and dose, including fertility, gestation, and any other signs of toxicity;
—
gestation length;
—
toxic or other effects on reproduction, offspring, postnatal growth, etc.;
—
nature, severity and duration of clinical observations (whether reversible or not);
—
sensory activity, grip strength and motor activity assessments;
—
haematological tests with relevant baseline values;
—
clinical biochemistry tests with relevant baseline values;
—
number of adult females with normal or abnormal oestrous cycle and cycle duration;
—
number of live births and post implantation loss;
—
number of pups with grossly visible abnormalities; gross evaluation of external genitalia, number of runts;
—
time of death during the study or whether animals survived to termination;
—
number of implantations, litter size and litter weights at the time of recording;
—
pup body weight data;
—
AGD of all pups (and body weight on day of AGD measurement);
—
nipple retention in male pups;
—
thyroid hormone levels, day 13 pups and adult males (and dams and day 4 pups if measured);
—
body weight at sacrifice and organ weight data for the parental animals;
—
necropsy findings;
—
a detailed description of histopathological findings;
—
absorption data (if available);
—
statistical treatment of results, where appropriate.
Discussion of results.
Conclusions.
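The conversion from dietary concentration (ppm) to achieved dose (mg/kg body weight/day), listed above under test conditions, can be illustrated with a minimal sketch. It assumes that ppm means mg of test chemical per kg of diet and that food intake and body weight are recorded in grams; the function and parameter names are hypothetical.

```python
def achieved_dose_mg_per_kg_bw_day(concentration_ppm: float,
                                   food_intake_g_per_day: float,
                                   body_weight_g: float) -> float:
    """Convert a dietary concentration to an achieved dose.

    Assumption: ppm = mg test chemical per kg diet.
    """
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    if concentration_ppm < 0 or food_intake_g_per_day < 0:
        raise ValueError("concentration and food intake must be non-negative")
    food_kg_per_day = food_intake_g_per_day / 1000.0
    body_weight_kg = body_weight_g / 1000.0
    # mg/kg diet x kg diet/day = mg/day, then divide by kg body weight
    return concentration_ppm * food_kg_per_day / body_weight_kg


# Hypothetical animal: 1000 ppm in diet, 20 g food/day, 250 g body weight
dose = achieved_dose_mg_per_kg_bw_day(1000.0, 20.0, 250.0)
```

Because food intake and body weight change over the course of the study, the achieved dose would normally be recalculated for each measurement interval.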
Interpretation of Results
77.
The study will provide evaluations of reproduction/developmental toxicity associated with administration of repeated doses. In particular, since emphasis is placed on both general toxicity and reproduction/developmental toxicity endpoints, the results of the study will allow for the discrimination between reproduction/developmental effects occurring in the absence of general toxicity and those which are only expressed at levels that are also toxic to parent animals (see paragraphs 7-11). It could provide an indication of the need to conduct further investigations and could provide guidance in the design of subsequent studies. OECD Guidance Document 43 should be consulted for aid in the interpretation of reproduction and developmental results (19). OECD Guidance Document 106 on Histologic Evaluation of Endocrine and Reproductive Tests in Rodents (16) provides information on the preparation and evaluation of (endocrine) organs and vaginal smears that may be helpful for this test method.
LITERATURE
(1)
OECD (1990). Room Document No 1 for the 14th Joint Meeting of the Chemicals Group and Management Committee. Available upon request at Organisation for Economic Cooperation and Development, Paris.
(2)
OECD (1992). Chairman's Report of the ad hoc Expert Meeting on Reproductive Toxicity Screening Methods, Tokyo, 27th-29th October, 1992. Available upon request at Organisation for Economic Cooperation and Development, Paris.
(3)
Mitsumori K., Kodama Y., Uchida O., Takada K., Saito M., Naito K., Tanaka S., Kurokawa Y., Usami M., Kawashima K., Yasuhara K., Toyoda K., Onodera H., Furukawa F., Takahashi M. and Hayashi Y. (1994). Confirmation Study, Using Nitro-Benzene, of the Combined Repeat Dose and Reproductive/Developmental Toxicity Test Protocol Proposed by the Organization for Economic Cooperation and Development (OECD). J. Toxicol. Sci., 19, 141-149.
(4)
Tanaka S., Kawashima K., Naito K., Usami M., Nakadate M., Imaida K., Takahashi M., Hayashi Y., Kurokawa Y. and Tobe M. (1992). Combined Repeat Dose and Reproductive/Developmental Toxicity Screening Test (OECD): Familiarization Using Cyclophosphamide. Fundam. Appl. Toxicol., 18, 89-95.
(5)
OECD (1998). Report of the First Meeting of the OECD Endocrine Disrupter Testing and Assessment (EDTA) Task Force, 10th-11th March 1998. Available upon request at Organisation for Economic Cooperation and Development, Paris.
(6)
OECD (2015). Feasibility Study for Minor Enhancements of TG 421/422 with ED Relevant Endpoints. Environment, Health and Safety Publications, Series on Testing and Assessment (No 217), Organisation for Economic Cooperation and Development, Paris.
(7)
OECD (2000). Guidance Document on the Recognition, Assessment, and Use of Clinical Signs as Humane Endpoints for Experimental Animals Used in Safety Evaluations, Environment, Health and Safety Publications, Series on Testing and Assessment, (No 19), Organisation for Economic Cooperation and Development, Paris.
(8)
Goldman J.M., Murr A.S., Buckalew A.R., Ferrell J.M. and Cooper R.L. (2007). The Rodent Estrous Cycle: Characterization of Vaginal Cytology and its Utility in Toxicological Studies, Birth Defects Research, Part B, 80 (2), 84-97.
(9)
Sadleir R.M.F.S. (1979). Cycles and Seasons, in Austin C.R. and Short R.V. (Eds.), Reproduction in Mammals: I. Germ Cells and Fertilization, Cambridge, New York.
(10)
IPCS (1986). Principles and Methods for the Assessment of Neurotoxicity Associated with Exposure to Chemicals. Environmental Health Criteria Document (No 60).
(11)
Moser V.C., McDaniel K.M. and Phillips P.M. (1991). Rat Strain and Stock Comparisons Using a Functional Observational Battery: Baseline Values and Effects of Amitraz. Toxicol. Appl. Pharmacol., 108, 267-283.
(12)
Meyer O.A., Tilson H.A., Byrd W.C. and Riley M.T. (1979). A Method for the Routine Assessment of Fore- and Hindlimb Grip Strength of Rats and Mice. Neurobehav. Toxicol., 1, 233-236.
(13)
Crofton K.M., Howard J.L., Moser V.C., Gill M.W., Reiter L.W., Tilson H.A., MacPhail R.C. (1991). Interlaboratory Comparison of Motor Activity Experiments: Implication for Neurotoxicological Assessments. Neurotoxicol. Teratol. 13, 599-609.
(14)
Gallavan R.H. Jr, J.F. Holson, D.G. Stump, J.F. Knapp and V.L. Reynolds. (1999). “Interpreting the Toxicologic Significance of Alterations in Anogenital Distance: Potential for Confounding Effects of Progeny Body Weights”, Reproductive Toxicology, 13: 383-390.
(15)
OECD (2013). Guidance Document in Support of the Test Guideline on the Extended One Generation Reproductive Toxicity Study. Environment, Health and Safety Publications, Series on Testing and Assessment (No 151). Organisation for Economic Cooperation and Development, Paris.
(16)
OECD (2009). Guidance Document for Histologic Evaluation of Endocrine and Reproductive Tests in Rodents. Environment, Health and Safety Publications, Series on Testing and Assessment (No 106), Organisation for Economic Cooperation and Development, Paris.
(17)
Hess RA and Moore BJ. (1993). Histological Methods for the Evaluation of the Testis. In: Methods in Reproductive Toxicology, Chapin RE and Heindel JJ (Eds.). Academic Press: San Diego, CA, pp. 52-85.
(18)
Latendresse JR, Warbritton AR, Jonassen H, Creasy DM. (2002). Fixation of Testes and Eyes Using a Modified Davidson's Fluid: Comparison with Bouin's Fluid and Conventional Davidson's Fluid. Toxicol. Pathol. 30, 524-533.
(19)
OECD (2008). Guidance Document on Mammalian Reproductive Toxicity Testing and Assessment. Environment, Health and Safety Publications, Series on Testing and Assessment (No 43), Organisation for Economic Cooperation and Development, Paris.
(20)
OECD (2011). Guidance Document on Standardised Test Guidelines for Evaluating Chemicals for Endocrine Disruption (No 150), Organisation for Economic Cooperation and Development, Paris.
Appendix 1
DEFINITIONS (SEE ALSO (20) OECD GD 150)
Androgenicity is the capability of a chemical to act like a natural androgenic hormone (e.g. testosterone) in a mammalian organism.
Antiandrogenicity is the capability of a chemical to suppress the action of a natural androgenic hormone (e.g. testosterone) in a mammalian organism.
Antioestrogenicity is the capability of a chemical to suppress the action of a natural oestrogenic hormone (e.g. oestradiol 17β) in a mammalian organism.
Antithyroid activity is the capability of a chemical to suppress the action of a natural thyroid hormone (e.g. T3) in a mammalian organism.
Chemical is a substance or a mixture.
Developmental toxicity: the manifestation of reproductive toxicity, representing pre-, peri- or post-natal, structural or functional disorders in the progeny.
Dose is the amount of test chemical administered. The dose is expressed as weight of test chemical per unit body weight of test animal per day (e.g. mg/kg body weight/day), or as a constant dietary concentration.
Dosage is a general term comprising dose, its frequency and the duration of dosing.
Evident toxicity is a general term describing clear signs of toxicity following administration of test chemical. These should be sufficient for hazard assessment and should be such that an increase in the dose administered can be expected to result in the development of severe toxic signs and probable mortality.
Impairment of fertility represents disorders of male or female reproductive functions or capacity.
Maternal toxicity: adverse effects on gravid females, occurring either specifically (direct effect) or not specifically (indirect effect) and being related to the gravid state.
NOAEL is the abbreviation for no-observed-adverse-effect level. This is the highest dose level at which no adverse treatment-related findings are observed.
Oestrogenicity is the capability of a chemical to act like a natural oestrogenic hormone (e.g. oestradiol 17β) in a mammalian organism.
Reproduction toxicity represents harmful effects on the progeny and/or an impairment of male and female reproductive functions or capacity.
Test chemical is any substance or mixture tested using this test method.
Thyroid activity is the capability of a chemical to act like a natural thyroid hormone (e.g. T3) in a mammalian organism.
Validation is a scientific process designed to characterise the operational requirements and limitations of a test method and to demonstrate its reliability and relevance for a particular purpose.
Appendix 2
DIAGRAM OF THE EXPERIMENTAL SCHEDULE, INDICATING THE MAXIMUM STUDY DURATION, BASED ON A FULL 14-DAY MATING PERIOD
Appendix 3
TABULAR SUMMARY REPORT OF EFFECTS ON REPRODUCTION/DEVELOPMENT
OBSERVATIONS
VALUES
Dosage (units).......
0 (control)
...
...
...
...
Pairs started (N)
Oestrous cycle (at least mean length and frequency of irregular cycles)
Pup weight at the time of AGD measurement (mean males, mean females)
Pup AGD on the same postnatal day, birth to day 4 (mean males, mean females, note PND)
Pup weight at day 4 (mean)
Pup weight at day 13 (mean)
Male pup nipple retention at day 13 (mean)
ABNORMAL PUPS
Dams with 0
Dams with 1
Dams with ≥ 2
LOSS OF OFFSPRING
Pre-natal (implantations minus live births)
Females with 0
Females with 1
Females with 2
Females with ≥ 3
Post-natal (live births minus alive at post natal day 13)
Females with 0
Females with 1
Females with 2
Females with ≥ 3
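The loss-of-offspring rows above can be filled in from per-female counts, as in the following minimal sketch. The record fields (`implantations`, `live_births`, `alive_pnd13`) are hypothetical names chosen for this illustration, and the sketch assumes that no pups were culled, or that culled pups have already been excluded from the counts.

```python
from collections import Counter


def tally(losses):
    """Count females with 0, 1, 2 and >= 3 lost offspring."""
    if any(n < 0 for n in losses):
        raise ValueError("a loss cannot be negative; check the counts")
    # Losses of three or more are grouped into a single bin
    binned = Counter(min(n, 3) for n in losses)
    return {"0": binned[0], "1": binned[1], "2": binned[2], ">=3": binned[3]}


def offspring_loss_summary(females):
    """Tabulate pre-natal and post-natal loss per female."""
    # Pre-natal loss: implantations minus live births
    pre = [f["implantations"] - f["live_births"] for f in females]
    # Post-natal loss: live births minus pups alive at post-natal day 13
    post = [f["live_births"] - f["alive_pnd13"] for f in females]
    return {"pre_natal": tally(pre), "post_natal": tally(post)}


# Hypothetical data for one dose group
group = [
    {"implantations": 12, "live_births": 12, "alive_pnd13": 12},
    {"implantations": 14, "live_births": 11, "alive_pnd13": 10},
    {"implantations": 10, "live_births": 9, "alive_pnd13": 9},
]
summary = offspring_loss_summary(group)
```

One such summary would be produced per dose group, giving one column of the table.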
B.65 IN VITRO MEMBRANE BARRIER TEST METHOD FOR SKIN CORROSION
INTRODUCTION
1.
This test method is equivalent to OECD test guideline (TG) 435 (2015). Skin corrosion refers to the production of irreversible damage to the skin, manifested as visible necrosis through the epidermis and into the dermis, following the application of a test chemical as defined by the United Nations (UN) Globally Harmonized System of Classification and Labelling of Chemicals (GHS) (1) and the European Union (EU) Regulation 1272/2008 on Classification, Labelling and Packaging of Substances and Mixtures (CLP) (24). This test method, equivalent to the updated OECD test guideline 435, provides an in vitro membrane barrier test method that can be used to identify corrosive chemicals. The test method utilises an artificial membrane designed to respond to corrosive chemicals in a manner similar to animal skin in situ.
2.
Skin corrosivity has traditionally been assessed by applying the test chemical to the skin of living animals and assessing the extent of tissue damage after a fixed period of time (2). Besides the present test method, a number of other in vitro test methods have been adopted as alternatives (3)(4) to the standard in vivo rabbit skin procedure (Chapter B.4 of this Annex, equivalent to OECD TG 404) used to identify corrosive chemicals (2). The UN GHS tiered testing and evaluation strategy for the assessment and classification of skin corrosivity and the OECD Guidance Document on Integrated Approaches to Testing and Assessment (IATA) for Skin Irritation/Corrosion recommend the use of validated and accepted in vitro test methods under modules 3 and 4 (1)(5). The IATA describes several modules which group information sources and analysis tools and (i) provides guidance on how to integrate and use existing test and non-test data for the assessment of the skin irritation and skin corrosion potentials of chemicals and (ii) proposes an approach when further testing is needed, including when negative results are found (5). In this modular approach, positive results from in vitro test methods can be used to classify a chemical as corrosive without the need for animal testing, thus reducing and refining the use of animals and avoiding the pain and distress that might occur if animals were used for this purpose.
3.
Validation studies have been completed for the in vitro membrane barrier model commercially available as Corrositex® (6)(7)(8), showing an overall accuracy to predict skin corrosivity of 79 % (128/163), a sensitivity of 85 % (76/89), and a specificity of 70 % (52/74) for a database of 163 substances and mixtures (7). Based on its acknowledged validity, this validated reference method (VRM) has been recommended for use as part of a tiered testing strategy for assessing the dermal corrosion hazard potential of chemicals (5)(7). Before an in vitro membrane barrier model for skin corrosion can be used for regulatory purposes, its reliability, relevance (accuracy), and limitations for its proposed use should be determined to ensure that it is similar to that of the VRM (9), in accordance with the pre-defined performance standards (PS) (10). The OECD Mutual Acceptance of Data will only be guaranteed after any proposed new or updated method following the PS has been reviewed and included in the equivalent OECD test guideline. Currently, only one in vitro method, the commercially available Corrositex® model, is covered by OECD test guideline 435 and this test method.
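The performance figures quoted above can be reproduced from the reported fractions, as in the following minimal sketch. It reads the fractions as 76 of 89 corrosive chemicals correctly identified and 52 of 74 non-corrosive chemicals correctly identified; the false-negative and false-positive counts (13 and 22) are derived from those fractions by subtraction and are not stated in the text.

```python
def predictive_performance(true_pos: int, false_neg: int,
                           true_neg: int, false_pos: int) -> dict:
    """Return accuracy, sensitivity and specificity as fractions."""
    total = true_pos + false_neg + true_neg + false_pos
    return {
        # Proportion of all chemicals correctly classified
        "accuracy": (true_pos + true_neg) / total,
        # Proportion of corrosive chemicals correctly identified
        "sensitivity": true_pos / (true_pos + false_neg),
        # Proportion of non-corrosive chemicals correctly identified
        "specificity": true_neg / (true_neg + false_pos),
    }


# Counts derived from the reported fractions 76/89 and 52/74
result = predictive_performance(true_pos=76, false_neg=89 - 76,
                                true_neg=52, false_pos=74 - 52)
percentages = {name: round(100 * value) for name, value in result.items()}
```

The three fractions are mutually consistent: 76 + 52 = 128 correct classifications out of 89 + 74 = 163 chemicals.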
4.
Other test methods for skin corrosivity testing are based on the use of reconstituted human skin (OECD TG 431) (3) and isolated rat skin (OECD TG 430) (4). This Test Guideline also provides for subcategorisation of corrosive chemicals into the three UN GHS Sub-categories of corrosivity and the three UN Transport Packing Groups for corrosivity hazard. This Test Guideline was originally adopted in 2006 and updated in 2015 to refer to the IATA guidance document and update the list of proficiency substances.
DEFINITIONS
5.
Definitions used are provided in the Appendix.
INITIAL CONSIDERATIONS AND LIMITATIONS
6.
The test described in this test method allows the identification of corrosive test chemicals and allows the sub-categorisation of corrosive test chemicals according to UN GHS/CLP (Table 1). In addition, such a test method may be used to make decisions on the corrosivity and non-corrosivity of specific classes of chemicals, e.g. organic and inorganic acids, acid derivatives (25), and bases for certain transport testing purposes (7)(11)(12). This test method describes a generic procedure similar to the validated reference test method (7). While this test method does not provide adequate information on skin irritation, it should be noted that TM B.46 (equivalent to OECD TG 439) specifically addresses the health effect skin irritation in vitro (13). For a full evaluation of local skin effects after a single dermal exposure, the OECD Guidance Document on Integrated Approaches to Testing and Assessment should be consulted (5).
Table 1
The UN GHS Skin Corrosive Category and Subcategories (26)
Corrosive Category (category 1) (for authorities not using subcategories) | Potential Corrosive Subcategories (26) (for authorities using subcategories, including the CLP Regulation) | Corrosive in ≥ 1 of 3 animals: Exposure | Corrosive in ≥ 1 of 3 animals: Observation
Corrosive | Corrosive subcategory 1A | ≤ 3 minutes | ≤ 1 hour
Corrosive | Corrosive subcategory 1B | > 3 minutes / ≤ 1 hour | ≤ 14 days
Corrosive | Corrosive subcategory 1C | > 1 hour / ≤ 4 hours | ≤ 14 days
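The exposure criteria of Table 1 can be expressed as a simple decision rule, sketched below. This is an illustration only: it encodes the exposure column of Table 1 (the in vivo criteria) and ignores the observation periods; it does not represent the breakthrough times used by the membrane barrier method. The function name is hypothetical.

```python
from typing import Optional


def subcategory_from_exposure(exposure_minutes: float) -> Optional[str]:
    """Map the exposure duration that produced corrosion to a subcategory.

    Encodes only the exposure column of Table 1; returns None when the
    exposure exceeds 4 hours, which Table 1 does not cover.
    """
    if exposure_minutes <= 0:
        raise ValueError("exposure duration must be positive")
    if exposure_minutes <= 3:
        return "1A"   # <= 3 minutes
    if exposure_minutes <= 60:
        return "1B"   # > 3 minutes and <= 1 hour
    if exposure_minutes <= 240:
        return "1C"   # > 1 hour and <= 4 hours
    return None


# Hypothetical exposure durations in minutes
examples = [subcategory_from_exposure(t) for t in (3, 3.5, 60, 240, 241)]
```

The boundaries are inclusive at the upper end of each range, matching the "≤" signs in the table.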
7.
A limitation of the validated reference method (7) is that many non-corrosive chemicals and some corrosive chemicals may not qualify for testing, based on the results of the initial compatibility test (see paragraph 13). Aqueous chemicals with a pH in the range of 4.5 to 8.5 often do not qualify for testing; however, 85 % of chemicals tested in this pH range were non-corrosive in animal tests (7). The in vitro membrane barrier method may be used to test solids (soluble or insoluble in water), liquids (aqueous or non-aqueous), and emulsions. However, test chemicals not causing a detectable change in the compatibility test (i.e. colour change in the Chemical Detection System (CDS) of the validated reference test method) cannot be tested with the membrane barrier method and should be tested using other test methods.
PRINCIPLE OF THE TEST
8.
The test system comprises two components: a synthetic macromolecular bio-barrier and a chemical detection system (CDS); this test method detects via the CDS membrane barrier damage caused by corrosive test chemicals after the application of the test chemical to the surface of the synthetic macromolecular membrane barrier (7), presumably by the same mechanism(s) of corrosion that operate on living skin.
9.
Penetration of the membrane barrier (or breakthrough) might be measured by a number of procedures or CDS, including a change in the colour of a pH indicator dye or in some other property of the indicator solution below the barrier.
10.
The membrane barrier should be determined to be valid, i.e. relevant and reliable, for its intended use. This includes ensuring that different preparations are consistent in regard to barrier properties, e.g. capable of maintaining a barrier to non-corrosive chemicals, able to categorise the corrosive properties of chemicals across the various UN GHS Sub-categories of corrosivity (1). The classification assigned is based on the time it takes a chemical to penetrate through the membrane barrier to the indicator solution.
DEMONSTRATION OF PROFICIENCY
11.
Prior to routine use of the in vitro membrane barrier method, adhering to this test method, laboratories should demonstrate technical proficiency by correctly classifying the twelve Proficiency Substances recommended in Table 2. In situations where a listed substance is unavailable or where justifiable, another substance for which adequate in vivo and in vitro reference data are available may be used (e.g. from the list of reference chemicals (10)) provided that the same selection criteria as described in Table 1 are applied.
12.
The following paragraphs describe the components and procedures of an artificial membrane barrier test method for corrosivity assessment (7)(15), based on the current VRM, i.e. the commercially available Corrositex®. The membrane barrier and the compatibility/indicator and categorisation solutions can be constructed, prepared or obtained commercially, as in the case of the VRM Corrositex®. A sample test method protocol for the validated reference test method is available (7). Testing should be performed at ambient temperature (17-25 °C) and the components should comply with the following conditions.
Test Chemical Compatibility Test
13.
Prior to performing the membrane barrier test, a compatibility test is performed to determine if the test chemical is detectable by the CDS. If the CDS does not detect the test chemical, the membrane barrier test method is not suitable for evaluating the potential corrosivity of that particular test chemical and a different test method should be used. The CDS and the exposure conditions used for the compatibility test should reflect the exposure in the subsequent membrane barrier test.
Test Chemical Timescale Category Test
14.
If appropriate for the test method, a test chemical that has been qualified by the compatibility test should be subjected to a timescale category test, i.e. a screening test to distinguish between weak and strong acids or bases. For example, in the validated reference test method a timescale categorisation test is used to indicate which of two timescales should be used based on whether significant acid or alkaline reserve is detected. Two different breakthrough timescales should be used for determining corrosivity and UN GHS skin corrosivity Sub-category, based on the acid or alkali reserve of the test chemical.
MEMBRANE BARRIER TEST METHOD COMPONENTS
Membrane Barrier
15.
The membrane barrier consists of two components: a proteinaceous macromolecular aqueous gel and a permeable supporting membrane. The proteinaceous gel should be impervious to liquids and solids but can be corroded and made permeable. The fully constructed membrane barrier should be stored under pre-determined conditions shown to preclude deterioration of the gel, e.g. drying, microbial growth, shifting, cracking, which would degrade its performance. The acceptable storage period should be determined and membrane barrier preparations should not be used after that period.
16.
The permeable supporting membrane provides mechanical support to the proteinaceous gel during the gelling process and exposure to the test chemical. The supporting membrane should prevent sagging or shifting of the gel and be readily permeable to all test chemicals.
17.
The proteinaceous gel, composed of protein, e.g. keratin, collagen, or mixtures of proteins, forming a gel matrix, serves as the target for the test chemical. The proteinaceous material is placed on the surface of the supporting membrane and allowed to gel prior to placing the membrane barrier over the indicator solution. The proteinaceous gel should be of equal thickness and density throughout, and with no air bubbles or defects that could affect its functional integrity.
Chemical Detection System (CDS)
18.
The indicator solution, which is the same solution used for the compatibility test, should respond to the presence of a test chemical. A pH indicator dye or combination of dyes, e.g. cresol red and methyl orange, that will show a colour change in response to the presence of the test chemical should be used. The measurement system can be visual or electronic.
19.
Detection systems that are developed for detecting the passage of the test chemical through the barrier membrane should be assessed for their relevance and reliability in order to demonstrate the range of chemicals that can be detected and the quantitative limits of detection.
TEST PERFORMANCE
Assembly of the Test Method Components
20.
The membrane barrier is positioned in a vial (or tube) containing the indicator solution so that the supporting membrane is in full contact with the indicator solution and with no air bubbles present. Care should be taken to ensure that barrier integrity is maintained.
Application of the Test Chemical
21.
A suitable amount of the test chemical, e.g. 500 μl of a liquid or 500 mg of a finely powdered solid (7), is carefully layered onto the upper surface of the membrane barrier and evenly distributed. An appropriate number of replicates, e.g. four (7), is prepared for each test chemical and its corresponding controls (see paragraphs 23 to 25). The time of applying the test chemical to the membrane barrier is recorded. To ensure that short corrosion times are accurately recorded, the application times of the test chemical to the replicate vials are staggered.
Measurement of Membrane Barrier Penetration
22.
Each vial is appropriately monitored and the time of the first change in the indicator solution, i.e. barrier penetration, is recorded, and the elapsed time between application and penetration of the membrane barrier determined.
Controls
23.
In tests that involve the use of a vehicle or solvent with the test chemical, the vehicle or solvent should be compatible with the membrane barrier system, i.e. not alter the integrity of the membrane barrier system, and should not alter the corrosivity of the test chemical. When applicable, solvent (or vehicle) control should be tested concurrently with the test chemical to demonstrate the compatibility of the solvent with the membrane barrier system.
24.
A positive (corrosive) control with intermediate corrosivity activity, e.g. 110 ± 15 mg sodium hydroxide (UN GHS Corrosive Sub-category 1B) (7), should be tested concurrently with the test chemical to assess if the test system is performing in an acceptable manner. A second positive control that is of the same chemical class as the test chemical may be useful for evaluating the relative corrosivity potential of a corrosive test chemical. Positive control(s) should be selected that are intermediate in their corrosivity (e.g. UN GHS Sub-category 1B) in order to detect changes in the penetration time that may be unacceptably longer or shorter than the established reference value, thereby indicating that the test system is not functioning properly. For this purpose, extremely corrosive (UN GHS Sub-category 1A) or non-corrosive chemicals are of limited utility. A corrosive UN GHS Sub-category 1B chemical would allow detection of a too rapid or too slow breakthrough time. A weakly corrosive chemical (UN GHS Sub-category 1C) might be employed as a positive control to measure the ability of the test method to consistently distinguish between weakly corrosive and non-corrosive chemicals. Regardless of the approach used, an acceptable positive control response range should be developed based on the historical range of breakthrough times for the positive control(s) employed, such as the mean ± 2-3 standard deviations. In each study, the exact breakthrough time should be determined for the positive control so that deviations outside the acceptable range can be detected.
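The derivation of an acceptable positive control range from historical data can be illustrated with a minimal sketch. The function names, the default multiplier of 2 standard deviations and the example breakthrough times are assumptions made for illustration only; they are not taken from the validated reference method.

```python
from statistics import mean, stdev

def acceptance_range(historical_times_min, k=2.0):
    """Return (low, high) as mean ± k sample standard deviations.

    historical_times_min: breakthrough times (minutes) from earlier runs
    of the same positive control. k is an assumed multiplier; the text
    mentions 2-3 standard deviations as one possible choice.
    """
    if len(historical_times_min) < 2:
        raise ValueError("at least two historical runs are needed")
    centre = mean(historical_times_min)
    spread = stdev(historical_times_min)
    return (centre - k * spread, centre + k * spread)

def positive_control_ok(observed_min, historical_times_min, k=2.0):
    """True if the concurrent positive control falls inside the range."""
    low, high = acceptance_range(historical_times_min, k)
    return low <= observed_min <= high

# Hypothetical historical breakthrough times, in minutes.
history = [10, 12, 14]
print(acceptance_range(history))
print(positive_control_ok(9, history), positive_control_ok(17, history))
```

A laboratory would substitute its own historical record and its own justified multiplier; the sketch only shows how an out-of-range control could be flagged.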
25.
A negative (non-corrosive) control, e.g. 10 % citric acid, 6 % propionic acid (7), should also be tested concurrently with the test chemical as another quality control measure to demonstrate the functional integrity of the membrane barrier.
Study Acceptability Criteria
26.
According to the established time parameters for each of the UN GHS corrosivity Sub-categories, the time (in minutes) elapsed between application of a test chemical to the membrane barrier and barrier penetration is used to predict the corrosivity of the test chemical. For a study to be considered acceptable, the concurrent positive control should give the expected penetration response time (e.g. 8-16 min breakthrough time for sodium hydroxide if used as a positive control), the concurrent negative control should not be corrosive, and, when included, the concurrent solvent control should neither be corrosive nor should it alter the corrosivity potential of the test chemical. Prior to routine use of a method that adheres to this test method, laboratories should demonstrate technical proficiency, using the twelve substances recommended in Table 2. For new “me-too” methods developed under this test method that are structurally and functionally similar to the validated reference method (14) the pre-defined performance standards should be used to demonstrate the reliability and accuracy of the new method prior to its use for regulatory testing (10).
Interpretation of Results and Corrosivity Classification of Test Chemicals
27.
The time (in minutes) elapsed between application of the test chemical to the membrane barrier and barrier penetration is used to classify the test chemical in terms of UN GHS corrosive Sub-categories (1) and, if applicable, UN Packing Group (16). Cut-off time values for each of the three corrosive subcategories are established for each proposed test method. Final decisions on cut-off times should consider the need to minimise under-classification of corrosive hazard (i.e. false negatives). In the present test guideline, the cut-off times of Corrositex® as described in Table 3 should be used, as Corrositex® represents the only test method currently falling within the test guideline (7).
Table 3
Mean breakthrough time, Category 1 test chemicals (30) (determined by the method’s categorisation test) | Mean breakthrough time, Category 2 test chemicals (31) (determined by the method’s categorisation test) | UN GHS prediction
0-3 min. | 0-3 min. | Corrosive, optional Sub-category 1A
> 3 to 60 min. | > 3 to 30 min. | Corrosive, optional Sub-category 1B
> 60 to 240 min. | > 30 to 60 min. | Corrosive, optional Sub-category 1C
> 240 min. | > 60 min. | Non-corrosive
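The cut-off times listed above amount to a simple lookup, which might be sketched as follows. The function name and the returned label strings are illustrative assumptions; the numerical cut-offs are those given above for Category 1 and Category 2 test chemicals.

```python
# Upper cut-offs in minutes for Sub-categories 1A, 1B and 1C, keyed by the
# category assigned in the timescale categorisation test.
CUT_OFFS_MIN = {
    1: (3, 60, 240),
    2: (3, 30, 60),
}

def predict_corrosivity(breakthrough_min, category):
    """Map a breakthrough time (minutes) to a UN GHS prediction."""
    if category not in CUT_OFFS_MIN:
        raise ValueError("category must be 1 or 2")
    if breakthrough_min < 0:
        raise ValueError("breakthrough time cannot be negative")
    limit_1a, limit_1b, limit_1c = CUT_OFFS_MIN[category]
    if breakthrough_min <= limit_1a:
        return "Corrosive, optional Sub-category 1A"
    if breakthrough_min <= limit_1b:
        return "Corrosive, optional Sub-category 1B"
    if breakthrough_min <= limit_1c:
        return "Corrosive, optional Sub-category 1C"
    return "Non-corrosive"

# The same 45-minute breakthrough gives different predictions depending
# on the category of the test chemical.
print(predict_corrosivity(45, 1))
print(predict_corrosivity(45, 2))
```

The example shows why the timescale categorisation test has to be reported together with the breakthrough time: the same time can fall into different sub-categories.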
DATA AND REPORTING
Data
28.
The time (in minutes) elapsed between application and barrier penetration for the test chemical and the positive control(s) should be reported in tabular form as individual replicate data, as well as means ± the standard deviation for each trial.
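As an illustration of this reporting requirement, the following sketch summarises the replicate data of one trial as individual values plus mean ± standard deviation. The field names, the row layout and the example times are hypothetical and only indicate one possible tabular form.

```python
from statistics import mean, stdev

def summarise_trial(sample, times_min):
    """Collect replicate breakthrough times with their mean and sample SD."""
    if not times_min:
        raise ValueError("no replicate data")
    # A standard deviation needs at least two replicates.
    sd = stdev(times_min) if len(times_min) > 1 else float("nan")
    return {
        "sample": sample,
        "replicates": list(times_min),
        "mean": mean(times_min),
        "sd": sd,
    }

def format_row(summary):
    """Render one summary as a single delimited line for a report table."""
    replicates = ", ".join(f"{t:.1f}" for t in summary["replicates"])
    return (f"{summary['sample']} | {replicates} | "
            f"{summary['mean']:.1f} ± {summary['sd']:.1f}")

# Hypothetical data: four replicates, times in minutes.
row = summarise_trial("test chemical", [10.0, 12.0, 14.0, 12.0])
print(format_row(row))
```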
Test Report
29.
The test report should include the following information:
Test Chemical and Control Substances:
—
Mono-constituent substance: chemical identification, such as IUPAC or CAS name, CAS number, SMILES or InChI code, structural formula, purity, chemical identity of impurities as appropriate and practically feasible, etc;
—
Multi-constituent substance, UVCB and mixture: characterised as far as possible by chemical identity (see above), quantitative occurrence and relevant physicochemical properties of the constituents;
—
Physical appearance, water solubility, and additional relevant physicochemical properties;
—
Source, lot number if available;
—
Treatment of the test chemical/control substance prior to testing, if applicable (e.g. warming, grinding);
—
Stability of the test chemical, limit date for use, or date for re-analysis if known;
In vitro membrane barrier model and protocol used, including demonstrated accuracy and reliability
Test Conditions:
—
Description of the apparatus and preparation procedures used;
—
Source and composition of the in vitro membrane barrier used;
—
Composition and properties of the indicator solution;
—
Method of detection;
—
Test chemical and control substance amounts;
—
Number of replicates;
—
Description and justification for the timescale categorisation test;
—
Method of application;
—
Observation times;
—
Description of the evaluation and classification criteria applied;
—
Demonstration of proficiency in performing the test method before routine use by testing of the proficiency chemicals.
Results:
—
Tabulation of individual raw data from individual test and control samples for each replicate;
—
Descriptions of other effects observed;
—
The derived classification with reference to the prediction model/decision criteria used.
Discussion of the results
Conclusions
LITERATURE
(1)
United Nations (UN) (2013). Globally Harmonized System of Classification and Labelling of Chemicals (GHS), Fifth Revised Edition, UN New York and Geneva, 2013. Available at: http://www.unece.org/trans/danger/publi/ghs/ghs_rev05/05files_e.html
(2)
Chapter B.4 of this Annex, Acute Dermal Irritation, Corrosion.
(3)
Chapter B.40bis of this Annex, In vitro skin corrosion: reconstructed human epidermis (RHE) test method.
(4)
Chapter B.40 of this Annex, In Vitro Skin Corrosion: Transcutaneous Electrical Resistance (TER).
(5)
OECD (2015). Guidance Document on Integrated Approaches to Testing and Assessment of Skin Irritation/Corrosion. Environment, Health and Safety Publications, Series on Testing and Assessment, (No 203). Organisation for Economic Cooperation and Development, Paris.
(6)
Fentem, J.H., Archer, G.E.B., Balls, M., Botham, P.A., Curren, R.D., Earl, L.K., Esdaile, D.J., Holzhutter, H.-G. and Liebsch, M. (1998). The ECVAM International Validation Study on In Vitro Tests for Skin Corrosivity. 2. Results and Evaluation by the Management Team. Toxicology In Vitro 12, 483-524.
(7)
ICCVAM (1999). Corrositex®. An In Vitro Test Method for Assessing Dermal Corrosivity Potential of Chemicals. The Results of an Independent Peer Review Evaluation Coordinated by ICCVAM, NTP and NICEATM. NIEHS, NIH Publication (No 99-4495.)
(8)
Gordon V.C., Harvell J.D. and Maibach H.I. (1994). Dermal Corrosion, the Corrositex® System: A DOT Accepted Method to Predict Corrosivity Potential of Test Materials. In vitro Skin Toxicology-Irritation, Phototoxicity, Sensitization. Alternative Methods in Toxicology 10, 37-45.
(9)
OECD (2005). Guidance Document on the Validation and International Acceptance of New or Updated Test Methods for Hazard Assessment. Environmental, Health and Safety Publications. Series on testing and Assessment (No 34).
(10)
OECD (2014). Performance Standards for the Assessment of Proposed Similar or Modified In Vitro Membrane Barrier Test Method for Skin Corrosion in Relation to TG 435. Organisation for Economic Cooperation and Development, Paris. Available at: http://www.oecd.org/chemicalsafety/testing/PerfStand-TG430-June14.pdf.
(11)
ECVAM (2001). Statement on the Application of the CORROSITEX Assay for Skin Corrosivity Testing. 15th Meeting of ECVAM Scientific Advisory Committee (ESAC), Ispra, Italy. ATLA 29, 96-97.
(12)
U.S. DOT (2002). Exemption DOT-E-10904 (Fifth Revision). (September 20, 2002). Washington, D.C., U.S. DOT.
(13)
Chapter B.46 of this Annex, In Vitro Skin Irritation: Reconstructed Human Epidermis Test Method.
(14)
ICCVAM (2004). ICCVAM Recommended Performance Standards for In Vitro Test Methods for Skin Corrosion. NIEHS, NIH Publication No 04-4510. Available at: http://www.ntp.niehs.nih.gov/iccvam/docs/dermal_docs/ps/ps044510.pdf.
(15)
U.S. EPA (1996). Method 1120, Dermal Corrosion. Available at: http://www.epa.gov/osw/hazard/testmethods/sw846/pdfs/1120.pdf.
(16)
United Nations (UN) (2013). UN Recommendations on the Transport of Dangerous Goods, Model Regulations, 18th Revised Edition (Part, Chapter 2.8), UN, 2013. Available at: http://www.unece.org/fileadmin/DAM/trans/danger/publi/unrec/rev18/English/Rev18_Volume1_Part2.pdf.
Appendix
DEFINITIONS
Accuracy: The closeness of agreement between test method results and accepted reference values. It is a measure of test method performance and one aspect of relevance. The term is often used interchangeably with “concordance” to mean the proportion of correct outcomes of a test method (9).
Chemical: A substance or a mixture.
Chemical Detection System (CDS): A visual or electronic measurement system with an indicator solution that responds to the presence of a test chemical, e.g. by a change in a pH indicator dye, or combination of dyes, that will show a colour change in response to the presence of the test chemical or by other types of chemical or electrochemical reactions.
Concordance: This is a measure of test method performance for test methods that give a categorical result, and is one aspect of relevance. The term is sometimes used interchangeably with accuracy, and is defined as the proportion of all chemicals tested that are correctly classified as positive or negative. Concordance is highly dependent on the prevalence of positives in the types of test chemical being examined (9).
GHS (Globally Harmonized System of Classification and Labelling of Chemicals): A system proposing the classification of chemicals (substances and mixtures) according to standardised types and levels of physical, health and environmental hazards, and addressing corresponding communication elements, such as pictograms, signal words, hazard statements, precautionary statements and safety data sheets, so as to convey information on their adverse effects with a view to protecting people (including employers, workers, transporters, consumers and emergency responders) and the environment (1).
IATA: Integrated Approach on Testing and Assessment.
Mixture: A mixture or solution composed of two or more substances.
Mono-constituent substance: A substance, defined by its quantitative composition, in which one main constituent is present to at least 80 % (w/w).
Multi-constituent substance: A substance, defined by its quantitative composition, in which more than one main constituent is present in a concentration ≥ 10 % (w/w) and < 80 % (w/w). A multi-constituent substance is the result of a manufacturing process. The difference between mixture and multi-constituent substance is that a mixture is obtained by blending of two or more substances without chemical reaction. A multi-constituent substance is the result of a chemical reaction.
NC: Non-corrosive.
Performance standards: Standards, based on a validated test method, that provide a basis for evaluating the comparability of a proposed test method that is mechanistically and functionally similar. Included are (i) essential test method components; (ii) a minimum list of Reference Chemicals selected from among the chemicals used to demonstrate the acceptable performance of the validated test method; and (iii) the similar levels of reliability and accuracy, based on what was obtained for the validated test method, that the proposed test method should demonstrate when evaluated using the minimum list of Reference Chemicals (9).
Relevance: Description of relationship of the test method to the effect of interest and whether it is meaningful and useful for a particular purpose. It is the extent to which the test method correctly measures or predicts the biological effect of interest. Relevance incorporates consideration of the accuracy (concordance) of a test method (9).
Reliability: Measures of the extent that a test method can be performed reproducibly within and between laboratories over time, when performed using the same protocol. It is assessed by calculating intra- and inter-laboratory reproducibility (9).
Sensitivity: The proportion of all positive/active chemicals that are correctly classified by the test method. It is a measure of accuracy for a test method that produces categorical results, and is an important consideration in assessing the relevance of a test method (9).
Skin corrosion in vivo: The production of irreversible damage of the skin; namely, visible necrosis through the epidermis and into the dermis, following the application of a test chemical for up to four hours. Corrosive reactions are typified by ulcers, bleeding, bloody scabs, and, by the end of observation at 14 days, by discoloration due to blanching of the skin, complete areas of alopecia, and scars. Histopathology should be considered to evaluate questionable lesions.
Specificity: The proportion of all negative/inactive chemicals that are correctly classified by the test method. It is a measure of accuracy for a test method that produces categorical results and is an important consideration in assessing the relevance of a test method (9).
Substance: A chemical element and its compounds in the natural state or obtained by any production process, including any additive necessary to preserve its stability and any impurities deriving from the process used, but excluding any solvent which may be separated without affecting the stability of the substance or changing its composition.
Test chemical: Any substance or mixture tested using this test method.
UVCB: Substances of unknown or variable composition, complex reaction products or biological materials.
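The definitions of sensitivity, specificity and concordance given in this Appendix can be expressed as proportions of correctly classified chemicals. The following sketch is an illustration only; the function name and the example counts are hypothetical and do not represent validation data.

```python
def performance(true_pos, false_neg, true_neg, false_pos):
    """Compute sensitivity, specificity and concordance from counts.

    true_pos / false_neg: positive (e.g. corrosive) reference chemicals
    classified correctly / incorrectly by the test method.
    true_neg / false_pos: negative reference chemicals classified
    correctly / incorrectly by the test method.
    """
    positives = true_pos + false_neg
    negatives = true_neg + false_pos
    if positives == 0 or negatives == 0:
        raise ValueError("both positive and negative chemicals are needed")
    return {
        # Proportion of all positive chemicals correctly classified.
        "sensitivity": true_pos / positives,
        # Proportion of all negative chemicals correctly classified.
        "specificity": true_neg / negatives,
        # Proportion of all chemicals tested that are correctly classified.
        "concordance": (true_pos + true_neg) / (positives + negatives),
    }

# Hypothetical counts: 10 positive and 10 negative reference chemicals.
print(performance(true_pos=8, false_neg=2, true_neg=9, false_pos=1))
```

Because concordance pools positives and negatives, it shifts with the share of positive chemicals in the set tested, which is the dependence on prevalence noted in the definition.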
B.66 STABLY TRANSFECTED TRANSACTIVATION IN VITRO ASSAYS TO DETECT ESTROGEN RECEPTOR AGONISTS AND ANTAGONISTS
GENERAL INTRODUCTION
OECD Performance-Based Test Guideline
1.
This test method is equivalent to OECD test guideline (TG) 455 (2016). TG 455 is a performance-based test guideline (PBTG), describing the methodology of Stably Transfected Transactivation In Vitro Assays to detect Estrogen Receptor Agonists and Antagonists (ER TA assays). It comprises several mechanistically and functionally similar test methods for the identification of estrogen receptor (i.e. ERα and/or ERβ) agonists and antagonists and should facilitate the development of new similar or modified test methods in accordance with the principles for validation set forth in the OECD Guidance Document on the Validation and International Acceptance of New or Updated Test Methods for Hazard Assessment (1). The fully validated reference test methods (Appendix 2 and Appendix 3) that provide the basis for this PBTG are:
—
The Stably Transfected TA (STTA) assay (2) using the (h) ERα-HeLa-9903 cell line; and
—
The VM7Luc ER TA assay (3) using the VM7Luc4E2 cell line (33), which predominantly expresses hERα with some contribution from hERβ (4)(5).
For the development and validation of similar assays for the same hazard endpoint, performance standards (PS) (6) (7) are available and should be used. They allow for timely amendment of PBTG 455 so that new similar assays can be added to an updated PBTG; however, similar assays will only be added after review and agreement by OECD that performance standards are met. The assays included in TG 455 can be used indiscriminately to address OECD member countries’ requirements for test results on estrogen receptor transactivation while benefiting from the OECD Mutual Acceptance of Data.
Background and principles of the assays included in this test method
2.
The OECD initiated a high-priority activity in 1998 to revise existing, and to develop new test guidelines for the screening and testing of potential endocrine disrupting chemicals. The OECD conceptual framework (CF) for testing and assessment of potential endocrine disrupting chemicals was revised in 2012. The original and revised CFs are included as Annexes in the OECD Guidance Document on Standardised Test Guidelines for Evaluating Chemicals for Endocrine Disruption (8). The CF comprises five levels, each level corresponding to a different level of biological complexity. The ER Transactivation (TA) assays described in this test method are level 2, which includes in vitro assays providing data about selected endocrine mechanism(s)/pathway(s). This test method is for in vitro Transactivation (TA) assays designed to identify estrogen receptor (ER) agonists and antagonists.
3.
The interaction of estrogens with ERs can affect transcription of estrogen-controlled genes, which can lead to the induction or inhibition of cellular processes, including those necessary for cell proliferation, normal fetal development, and reproductive function (9)(10)(11). Perturbation of normal estrogenic systems may have the potential to trigger adverse effects on normal development (ontogenesis), reproductive health and the integrity of the reproductive system.
4.
In vitro TA assays are based on a direct or indirect interaction of the substances with a specific receptor that regulates the transcription of a reporter gene product. Such assays have been used extensively to evaluate gene expression regulated by specific nuclear receptors, such as ERs (12) (13) (14) (15) (16). They have been proposed for the detection of estrogenic transactivation regulated by the ER (17) (18) (19). There are at least two major subtypes of nuclear ERs, α and β, which are encoded by distinct genes. The respective proteins have different biological functions as well as different tissue distributions and ligand binding affinities (20)(21)(22)(23)(24)(25)(26). Nuclear ERα mediates the classic estrogenic response (27)(28)(29)(30), and therefore most models currently being developed to measure ER activation or inhibition are specific to ERα. The assays are used to identify chemicals that activate (or inhibit) the ER following ligand binding, after which the receptor-ligand complex binds to specific DNA response elements and transactivates a reporter gene, resulting in increased cellular expression of a marker protein. Different reporter responses can be used in these assays. In luciferase-based systems, the luciferase enzyme transforms the luciferin substrate to a bioluminescent product that can be quantitatively measured with a luminometer. Other examples of common reporters are fluorescent protein and the LacZ gene, which encodes β-galactosidase, an enzyme that can transform the colourless substrate X-gal (5-bromo-4-chloro-indolyl-galactopyranoside) into a blue product that can be quantified with a spectrophotometer. These reporters can be evaluated quickly and inexpensively with commercially available test kits.
5.
Validation studies of the STTA and the VM7Luc TA assays have demonstrated their relevance and reliability for their intended purpose (3)(4)(5)(30). Performance standards for luminescence-based ER TA assays using breast cell lines are included in the ICCVAM Test Method Evaluation Report on the LUMI-CELL® ER (VM7Luc ER TA) Test Method: An In Vitro Assay for Identifying Human Estrogen Receptor Agonist and Antagonist Activity of Chemicals (3). These performance standards have been modified to be applicable to both the STTA and VM7Luc TA assays (2).
6.
Definitions and abbreviations used in this test method are described in Appendix 1.
Scope and limitations related to the TA assays
7.
These assays are being proposed for screening and prioritisation purposes, but can also provide mechanistic information that can be used in a weight of evidence approach. They address TA induced by chemical binding to the ERs in an in vitro system. Thus, results should not be directly extrapolated to the complex signalling and regulation of the intact endocrine system in vivo.
8.
TA mediated by the ERs is considered one of the key mechanisms of endocrine disruption (ED), although there are other mechanisms through which ED can occur, including (i) interactions with other receptors and enzymatic systems within the endocrine system, (ii) hormone synthesis, (iii) metabolic activation and/or inactivation of hormones, (iv) distribution of hormones to target tissues, and (v) clearance of hormones from the body. None of the assays under this test method addresses these modes of action.
9.
This test method addresses the ability of chemicals to activate (i.e. act as agonists) and also to suppress (i.e. act as antagonists) ER-dependent transcription. Some chemicals may, in a cell type-dependent manner, display both agonist and antagonist activity and are known as selective estrogen receptor modulators (SERMs). Chemicals that are negative in these assays could be evaluated in an ER binding assay before concluding that the chemical does not bind to the receptor. In addition, the assays are only likely to inform on the activity of the parent molecule, bearing in mind the limited metabolising capacities of the in vitro cell systems. Considering that only single substances were used during the validation, the applicability to test mixtures has not been addressed. The test method is nevertheless theoretically applicable to the testing of multi-constituent substances, UVCBs and mixtures. Before use of the test method on a multi-constituent substance, UVCB or mixture for generating data for an intended regulatory purpose, it should be considered whether, and if so why, it may provide adequate results for that purpose. Such considerations are not needed when there is a regulatory requirement for testing of the mixture.
10.
For informational purposes, Table 1 provides the agonist test results for the 34 substances that were tested in both of the fully validated reference test methods described in this test method. Of these substances, 26 are classified as definitive ER agonists and 8 negatives based upon published reports, including in vitro assays for ER binding and TA, and/or the uterotrophic assay (2)(3)(18)(31)(32)(33)(34). Table 2 provides the antagonist test results for the 15 substances that were tested in both of the fully validated reference test methods described in this test method. Of these substances, 4 are classified as definitive/presumed ER antagonists and 10 negatives based upon published reports, including in vitro assays for ER binding and TA (2)(3)(18)(31). In reference to the data summarised in Table 1 and Table 2, there was 100 % agreement between the two reference test methods on the classifications of all the substances except for one substance (Mifepristone) in the antagonist assay, and each substance was correctly classified as an ER agonist/antagonist or negative. Supplementary information on this group of chemicals as well as additional chemicals tested in the STTA and VM7Luc ER TA assays during the validation studies is provided in the Performance Standards for the ER TA assays (6)(7), Appendix 2 (Tables 1, 2 and 3).
Table 1
Overview of the Results from STTA and VM7Luc ER TA Assays for Substances Tested in Both Agonist Assays and Classified as ER Agonists (POS) or Negatives (NEG)
Abbreviations: CASRN = Chemical Abstracts Service Registry Number; M = molar; EC50 = half maximal effective concentration of test substance; NEG = negative; POS = positive; NT = Not tested; PC10 (and PC50) = the concentration of a test substance at which the response is 10 % (or 50 % for PC50) of the response induced by the positive control (E2, 1nM) in each plate.
Table 2
Comparison of Results from STTA and VM7Luc ER TA Assays for Substances Tested in Both Antagonist Assays and Classified as ER Antagonists (POS) or Negatives (NEG)
Abbreviations: CASRN = Chemical Abstracts Service Registry Number; M = molar; IC50 = half maximal inhibitory concentration of test substance; NEG = negative; PN = presumed negative; POS = positive; PP = presumed positive.
ER TA ASSAY COMPONENTS
Essential Assay Components
11.
This test method applies to assays using a stably transfected or endogenous ERα receptor and stably transfected reporter gene construct under the control of one or more estrogen response elements; however, other receptors such as ERβ may be present. These are essential assay components.
Controls
12.
The basis for the proposed concurrent reference standards for each of the agonist and antagonist assays should be described. Concurrent controls (negative, solvent, and positive), as appropriate, serve as an indication that the assay is operative under the test conditions and provide a basis for experiment-to-experiment comparisons; they are usually part of the acceptability criteria for a given experiment (1).
Standard Quality Control Procedures
13.
Standard quality control procedures should be performed as described for each assay to ensure the cell line remains stable through multiple passages, remains mycoplasma-free (i.e. free of bacterial contamination), and retains the ability to provide the expected ER-mediated responses over time. Cell lines should be further checked for their correct identity as well as for other contaminants (e.g. fungi, yeast and viruses).
Demonstration of Laboratory Proficiency
14.
Prior to testing unknown chemicals with any of the assays under this test method, each laboratory should demonstrate proficiency in using the assay. To demonstrate proficiency, each laboratory should test the 14 proficiency substances listed in Table 3 for the agonist assay and 10 proficiency substances in Table 4 for the antagonist assay. This proficiency testing will also confirm the responsiveness of the test system. The list of proficiency substances is a subset of the reference substances provided in the Performance Standards for the ER TA assays (6). These substances are commercially available, represent the classes of chemicals commonly associated with ER agonist or antagonist activity, exhibit a suitable range of potency expected for ER agonists/antagonists (i.e. strong to weak) and include negatives. Testing of the proficiency substances should be replicated at least twice, on different days. Proficiency is demonstrated by correct classification (positive/negative) of each proficiency substance. Proficiency testing should be repeated by each technician when learning the assays. Dependent on cell type, some of these proficiency substances may behave as SERMs and display activity as both agonists and antagonists. However, the proficiency substances are classified in Tables 3 and 4 by their known predominant activity which should be used for proficiency evaluation.
15.
To demonstrate performance and for quality control purposes each laboratory should compile agonist and antagonist historical databases with reference standard (e.g. 17β-estradiol and tamoxifen), positive and negative control chemicals and solvent control (e.g. DMSO) data. As a start, the database should be generated from at least 10 independent agonist (e.g. 17β-estradiol) and 10 independent antagonist (e.g. tamoxifen) runs. Results from future analyses of these reference standards and solvent controls should be added to enlarge the database to ensure consistency and performance of the bioassay by the laboratory over time.
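As an illustration only, the summary statistics that a laboratory might derive from such a historical database can be sketched as follows. The function name, the dictionary keys and the choice of the sample standard deviation are assumptions made for this sketch and are not prescribed by this test method; the minimum of 10 independent runs is taken from paragraph 15.

```python
from statistics import mean, stdev

# Paragraph 15: the database should start from at least 10 independent runs.
MIN_RUNS = 10


def summarise_reference_standard(values):
    """Summarise one curve-fitting parameter of a reference standard.

    `values` holds one value per independent run (hypothetical input,
    e.g. the EC50 of 17beta-estradiol from each agonist run).
    Returns the number of runs, mean, sample SD and CV in percent.
    """
    n = len(values)
    if n < MIN_RUNS:
        raise ValueError(f"at least {MIN_RUNS} independent runs are needed, got {n}")
    m = mean(values)
    sd = stdev(values)  # sample standard deviation (assumption of this sketch)
    cv_percent = abs(sd / m) * 100.0
    return {"n": n, "mean": m, "sd": sd, "cv_percent": cv_percent}
```

New runs would simply be appended to the list of values before the summary is recomputed, so that drift in the mean or an increase in the CV over time becomes visible.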
Table 3
List of (14) Proficiency Substances for agonist assay (53)
Abbreviations: CASRN = Chemical Abstracts Service Registry Number; EC50 = half maximal effective concentration of test substance; NEG = negative; POS = positive; PC10 (and PC50) = the concentration of a test substance at which the response is 10 % (or 50 % for PC50) of the response induced by the positive control (E2, 1 nM) in each plate.
Table 4
List of (10) Proficiency Substances for antagonist assay
Abbreviations: CASRN = Chemical Abstracts Service Registry Number; M = molar; IC50 = half maximal inhibitory concentration of test substance; NEG = negative; PN = presumed negative; POS = positive.
Test Run Acceptability Criteria
16.
Acceptance or rejection of a test run is based on the evaluation of results obtained for the reference standards and controls used for each experiment. Values for the PC50 (EC50) or IC50 for the reference standards should meet the acceptability criteria as provided for the selected assay (for STTA see Appendix 2, for VM7Luc ER TA see Appendix 3), and all positive/negative controls should be correctly classified for each accepted experiment. The ability to consistently conduct the assay should be demonstrated by the development and maintenance of a historical database for the reference standards and controls (see paragraph 15). Standard deviations (SD) or coefficients of variation (CV) for the means of reference standards curve fitting parameters from multiple experiments may be used as a measure of within-laboratory reproducibility. In addition, the following principles regarding acceptability criteria should be met:
—
Data should be sufficient for a quantitative assessment of ER activation (for agonist assay) or suppression (for antagonist assay) (i.e. efficacy and potency).
—
The mean reporter activity for the reference concentration of reference estrogen should be at least the minimum specified in the assays relative to that of the vehicle (solvent) control to ensure adequate sensitivity. For the STTA and VM7Luc ER TA assays, this is four times that of the mean vehicle control on each plate.
—
The concentrations tested should remain within the solubility range of the test chemicals and not demonstrate cytotoxicity.
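The sensitivity principle above (mean reporter activity of the reference estrogen at least four times that of the mean vehicle control on each plate) can be sketched as a simple per-plate check. This is a minimal sketch: the function names and the use of raw relative light units (RLU) as input are assumptions, and the detailed acceptability criteria of the selected assay in Appendix 2 or Appendix 3 take precedence.

```python
from statistics import mean

# Minimum fold induction stated for the STTA and VM7Luc ER TA assays.
MIN_FOLD_INDUCTION = 4.0


def fold_induction(reference_rlu, vehicle_rlu):
    """Mean reference estrogen signal divided by mean vehicle control signal."""
    return mean(reference_rlu) / mean(vehicle_rlu)


def plate_is_sensitive(reference_rlu, vehicle_rlu, minimum=MIN_FOLD_INDUCTION):
    """True if the plate meets the minimum fold induction."""
    return fold_induction(reference_rlu, vehicle_rlu) >= minimum
```

Both arguments are the replicate well readings from one plate, so the check is repeated for every plate of a run.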
Analysis of data
17.
The defined data interpretation procedure for each assay should be used for classifying a positive and negative response.
18.
Meeting the acceptability criteria (paragraph 16) indicates the assay is operating properly, but it does not ensure that any particular test run will produce accurate data. Replicating the results of the first run is the best indication that accurate data were produced. If two runs give reproducible results (e.g. both test run results indicate a test chemical is positive), it is not necessary to conduct a third run.
19.
If two runs do not give reproducible results (e.g. a test chemical is positive in one run and negative in the other run), or if a higher degree of certainty is required regarding the outcome of this assay, at least three independent runs should be conducted. In this case the classification is based on the two concordant results out of the three.
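The decision logic of paragraphs 18 and 19 can be sketched as follows. The labels "POS" and "NEG", the function name and the return value `None` (meaning that a further independent run is needed) are assumptions of this sketch; the handling of more than three runs by strict majority is likewise an assumption, since the text only describes two concordant results out of three.

```python
from collections import Counter


def classify_from_runs(run_results):
    """Derive a classification from the outcomes of independent runs.

    `run_results` is a list such as ["POS", "NEG", "NEG"].
    Returns the agreed label, or None if more runs are needed.
    """
    n = len(run_results)
    if n < 2:
        return None  # a single run should be replicated
    if n == 2:
        first, second = run_results
        # Two reproducible runs suffice; discordant runs call for a third.
        return first if first == second else None
    label, count = Counter(run_results).most_common(1)[0]
    # For three runs this is the "two concordant results out of the three".
    return label if count * 2 > n else None
```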
General Data Interpretation Criteria
20.
There is currently no universally agreed method for interpreting ER TA data. However, both qualitative (e.g. positive/negative) and/or quantitative (e.g. EC50, PC50, IC50) assessments of ER-mediated activity should be based on empirical data and sound scientific judgment. Where possible, positive results should be characterised by both the magnitude of the effect as compared to the vehicle (solvent) control or reference estrogen and the concentration at which the effect occurs (e.g. an EC50, PC50, RPCMax, IC50, etc.).
Test Report
21.
The test report should include the following information:
Assay:
—
assay used;
—
control/Reference standard/Test chemical;
—
source, lot number, limit date for use, if available;
—
stability of the test chemical itself, if known;
—
solubility and stability of the test chemical in solvent, if known;
—
measurement of pH, osmolality and precipitate in the culture medium to which the test chemical was added, as appropriate.
Mono-constituent substance:
—
physical appearance, water solubility, and additional relevant physicochemical properties;
—
chemical identification, such as IUPAC or CAS name, CAS number, SMILES or InChI code, structural formula, purity, chemical identity of impurities as appropriate and practically feasible, etc.
Multi-constituent substance, UVCBs and mixtures:
—
characterised as far as possible by chemical identity (see above), quantitative occurrence and relevant physicochemical properties of the constituents.
Solvent/Vehicle:
—
characterisation (nature, supplier and lot);
—
justification for choice of solvent/vehicle;
—
solubility and stability of the test chemical in solvent/vehicle, if known.
Cells:
—
type and source of cells:
—
Is ER endogenously expressed? If not, which receptor(s) were transfected?
—
Reporter construct(s) used (including source species);
—
Transfection method;
—
Selection method for maintenance of stable transfection (where applicable);
—
Is the transfection method relevant for stable lines?
—
number of cell passages (from thawing);
—
passage number of cells at thawing;
—
methods for maintenance of cell cultures.
Test conditions:
—
solubility limitations;
—
description of the methods of assessing viability applied;
—
composition of media, CO2 concentration;
—
concentrations of test chemical;
—
volume of vehicle and test chemical added;
—
incubation temperature and humidity;
—
duration of treatment;
—
cell density at the start of and during treatment;
—
positive and negative reference standards;
—
reporter reagents (product name, supplier and lot);
—
criteria for considering test runs as positive, negative or equivocal.
Acceptability check:
—
fold inductions for each assay plate and whether they meet the minimum required by the assay based on historical controls;
—
actual values for acceptability criteria, e.g. log10EC50, log10PC50, logIC50 and Hillslope values, for concurrent positive controls/reference standards.
Results:
—
raw and normalised data;
—
the maximum fold induction level;
—
cytotoxicity data;
—
if it exists, the lowest effective concentration (LEC);
—
RPCMax, PCMax, PC50, IC50 and/or EC50 values, as appropriate;
—
concentration-response relationship, where possible;
—
statistical analyses, if any, together with a measure of error and confidence (e.g. SEM, SD, CV or 95 % CI) and a description of how these values were obtained.
Discussion of the results
Conclusion
LITERATURE
(1)
OECD (2005). Guidance Document on the Validation and International Acceptance of New or Updated Test Methods for Hazard Assessment. Environment, Health and Safety Publications, Series on Testing and Assessment (No 34.), Organisation for Economic Cooperation and Development, Paris.
(2)
OECD (2015). Report of the Inter-Laboratory Validation for Stably Transfected Transactivation Assay to detect Estrogenic and Anti-estrogenic Activity. Environment, Health and Safety Publications, Series on Testing and Assessment (No 225), Organisation for Economic Cooperation and Development, Paris.
(3)
ICCVAM (2011). ICCVAM Test Method Evaluation Report on the LUMI-CELL® ER (BG1Luc ER TA) Test Method, an In Vitro Method for Identifying ER Agonists and Antagonists, National Institute of Environmental Health Sciences: Research Triangle Park, NC.
(4)
Pujol P. et al. (1998). Differential Expression of Estrogen Receptor-Alpha and -Beta Messenger RNAs as a Potential Marker of Ovarian Carcinogenesis, Cancer. Res., 58(23): p. 5367-73.
(5)
Rogers J.M. and Denison M.S. (2000). Recombinant Cell Bioassays for Endocrine Disruptors: Development of a Stably Transfected Human Ovarian Cell Line for the Detection of Estrogenic and Anti-Estrogenic Chemicals, In Vitro and Molecular Toxicology: Journal of Basic and Applied Research, 13(1): p. 67-82.
(6)
OECD (2012). Performance Standards For Stably Transfected Transactivation In Vitro Assay to Detect Estrogen Receptor Agonists (for TG 455). Environment, Health and Safety Publications, Series on Testing and Assessment (No 173.), Organisation for Economic Cooperation and Development, Paris.
(7)
OECD (2015). Performance Standards For Stably Transfected Transactivation In Vitro Assay to Detect Estrogen Receptor Antagonists. Environment, Health and Safety Publications, Series on Testing and Assessment (No 174.), Organisation for Economic Cooperation and Development, Paris.
(8)
OECD (2012). Guidance Document on Standardized Test Guidelines for Evaluating Chemicals for Endocrine Disruption. Environment, Health and Safety Publications, Series on Testing and Assessment (No 150.), Organisation for Economic Cooperation and Development, Paris.
(9)
Cavailles V. (2002). Estrogens and Receptors: an Evolving Concept. Climacteric, 5 Suppl 2: p. 20-6.
(10)
Welboren W.J. et al. (2009). Genomic Actions of Estrogen Receptor Alpha: What are the Targets and how are they Regulated? Endocr. Relat. Cancer, 16(4): p. 1073-89.
(11)
Younes M. and Honma N. (2011). Estrogen Receptor Beta, Arch. Pathol. Lab. Med., 135(1): p. 63-6.
(12)
Jefferson W.N., et al. (2002). Assessing Estrogenic Activity of Phytochemicals Using Transcriptional Activation and Immature Mouse Uterotrophic Responses, Journal of Chromatography B, 777(1-2): p. 179-189.
(13)
Sonneveld E. et al. (2006). Comparison of In Vitro and In Vivo Screening Models for Androgenic and Estrogenic Activities, Toxicol. Sci., 89(1): p. 173-187.
(14)
Takeyoshi M. et al. (2002). The Efficacy of Endocrine Disruptor Screening Tests in Detecting Anti-Estrogenic Effects Downstream of Receptor-Ligand Interactions, Toxicology Letters, 126(2): p. 91-98.
(15)
Combes R.D. (2000). Endocrine Disruptors: a Critical Review of In Vitro and In Vivo Testing Strategies for Assessing their Toxic Hazard to Humans, ATLA Alternatives to Laboratory Animals, 28(1): p. 81-118.
(16)
Escande A. et al. (2006). Evaluation of Ligand Selectivity Using Reporter Cell Lines Stably Expressing Estrogen Receptor Alpha or Beta, Biochem. Pharmacol, 71(10): p. 1459-69.
(17)
Gray L.E. Jr. (1998). Tiered Screening and Testing Strategy for Xenoestrogens and Antiandrogens, Toxicol. Lett, 102-103, 677-680.
(18)
EDSTAC (1998). Endocrine Disruptor Screening and Testing Advisory Committee (EDSTAC) Final Report.
(19)
ICCVAM (2003). ICCVAM Evaluation of In Vitro Test Methods for Detecting Potential Endocrine Disruptors: Estrogen Receptor and Androgen Receptor Binding and Transcriptional Activation Assays.
(20)
Gustafsson J.Ö. (1999). Estrogen Receptor β - A New Dimension in Estrogen Mechanism of Action, Journal of Endocrinology, 163(3): p. 379-383.
(21)
Ogawa S. et al. (1998). The Complete Primary Structure of Human Estrogen Receptor β (hERβ) and its Heterodimerization with ER In Vivo and In Vitro, Biochemical and Biophysical Research Communications, 243(1): p. 122-126.
(22)
Enmark E. et al. (1997). Human Estrogen Receptor β-Gene Structure, Chromosomal Localization, and Expression Pattern, Journal of Clinical Endocrinology and Metabolism, 82(12): p. 4258-4265.
(23)
Ball L.J. et al. (2009). Cell Type- and Estrogen Receptor-Subtype Specific Regulation of Selective Estrogen Receptor Modulator Regulatory Elements, Molecular and Cellular Endocrinology, 299(2): p. 204-211.
(24)
Barkhem T. et al. (1998). Differential Response of Estrogen Receptor Alpha and Estrogen Receptor Beta to Partial Estrogen Agonists/Antagonists, Mol. Pharmacol, 54(1): p. 105-12.
(25)
Deroo B.J. and Buensuceso A.V. (2010). Minireview: Estrogen Receptor-β: Mechanistic Insights from Recent Studies, Molecular Endocrinology, 24(9): p. 1703-1714.
(26)
Harris D.M. et al. (2005). Phytoestrogens Induce Differential Estrogen Receptor Alpha- or Beta-Mediated Responses in Transfected Breast Cancer Cells, Experimental Biology and Medicine, 230(8): p. 558-568.
(27)
Anderson J.N., Clark J.H. and Peck E.J. Jr. (1972). The Relationship Between Nuclear Receptor-Estrogen Binding and Uterotrophic Responses, Biochemical and Biophysical Research Communications, 48(6): p. 1460-1468.
(28)
Toft D. (1972). The Interaction of Uterine Estrogen Receptors with DNA, Journal of Steroid Biochemistry, 3(3): p. 515-522.
(29)
Gorski J. et al. (1968). Hormone Receptors: Studies on the Interaction of Estrogen with the Uterus, Recent Progress in Hormone Research, 24: p. 45-80.
(30)
Jensen E.V. et al. (1967). Estrogen-Receptor Interactions in Target Tissues, Archives d’Anatomie Microscopique et de Morphologie Experimentale, 56(3): p. 547-569.
(31)
ICCVAM (2002). Background Review Document: Estrogen Receptor Transcriptional Activation (TA) Assay. Appendix D, Substances Tested in the ER TA Assay, NIH Publication Report (No 03-4505.).
(32)
Kanno J. et al. (2001). The OECD Program to Validate the Rat Uterotrophic Bioassay to Screen Compounds for In Vivo Estrogenic Responses: Phase 1, Environ. Health Persp., 109:785-94.
(33)
Kanno J. et al. (2003). The OECD Program to Validate the Rat Uterotrophic Bioassay: Phase Two Dose-Response Studies, Environ. Health Persp., 111:1530-1549.
(34)
Kanno J. et al. (2003). The OECD Program to Validate the Rat Uterotrophic Bioassay: Phase Two – Coded Single-Dose Studies, Environ. Health Persp., 111:1550-1558.
(35)
Geisinger et al. (1989) Characterization of a human ovarian carcinoma cell line with estrogen and progesterone receptors, Cancer 63, 280-288.
(36)
Baldwin et al. (1998) BG-1 ovarian cell line: an alternative model for examining estrogen-dependent growth in vitro, In Vitro Cell. Dev. Biol. – Animal, 34, 649-654.
(37)
Li, Y. et al. (2014) Research resource: STR DNA profile and gene expression comparisons of human BG-1 cells and a BG-1/MCF-7 clonal variant, Mol. Endo. 28, 2072-2081.
(38)
Rogers, J.M. and Denison, M.S. (2000) Recombinant cell bioassays for endocrine disruptors: development of a stably transfected human ovarian cell line for the detection of estrogenic and anti-estrogenic chemicals, In Vitro & Molec. Toxicol. 13, 67-82.
Appendix 1
DEFINITIONS AND ABBREVIATIONS
Acceptability criteria: Minimum standards for the performance of experimental controls and reference standards. All acceptability criteria should be met for an experiment to be considered valid.
Accuracy (concordance): The closeness of agreement between assay results and accepted reference values. It is a measure of assay performance and one aspect of relevance. The term is often used interchangeably with “concordance” to mean the proportion of correct outcomes of an assay (1).
Agonist: A substance that produces a response, e.g. transcription, when it binds to a specific receptor.
Antagonist: A type of receptor ligand or chemical that does not provoke a biological response itself upon binding to a receptor, but blocks or dampens agonist-mediated responses.
Anti-estrogenic activity: the capability of a chemical to suppress the action of 17β-estradiol mediated through estrogen receptors.
Cell morphology: The shape and appearance of cells grown in a monolayer in a single well of a tissue culture plate. Cells that are dying often exhibit abnormal cell morphology.
CF: The OECD Conceptual Framework for the Testing and Evaluation of Endocrine Disrupters.
Charcoal/dextran treatment: Treatment of serum used in cell culture. Treatment with charcoal/dextran (often referred to as “stripping”) removes endogenous hormones and hormone-binding proteins.
Chemical: A substance or a mixture.
Cytotoxicity: Harmful effects to cell structure or function that can ultimately cause cell death and can be reflected by a reduction in the number of cells present in the well at the end of the exposure period or a reduction of the capacity for a measure of cellular function when compared to the concurrent vehicle control.
EC50: The half maximal effective concentration of a test chemical.
ED: Endocrine disruption
hERα: Human estrogen receptor alpha
hERβ: Human estrogen receptor beta
EFM: Estrogen-free medium. Dulbecco’s Modification of Eagle’s Medium (DMEM) supplemented with 4,5 % charcoal/dextran-treated FBS, 1,9 % L-glutamine, and 0,9 % Pen-Strep.
ER: Estrogen receptor
ERE: Estrogen response element
Estrogenic activity: The capability of a chemical to mimic 17β-estradiol in its ability to bind to and activate estrogen receptors. hERα-mediated estrogenic activity can be detected with this test method.
ERTA: Estrogen Receptor Trans Activation
FBS: Fetal bovine serum
HeLa: An immortal human cervical cell line
HeLa9903: A HeLa cell subclone into which hERα and a luciferase reporter gene have been stably transfected
IC50: The half maximal effective concentration of an inhibitory test chemical.
ICCVAM: The Interagency Coordinating Committee on the Validation of Alternative Methods.
Inter-laboratory reproducibility: A measure of the extent to which different qualified laboratories, using the same protocol and testing the same substances, can produce qualitatively and quantitatively similar results. Interlaboratory reproducibility is determined during the prevalidation and validation processes, and indicates the extent to which an assay can be successfully transferred between laboratories, also referred to as between-laboratory reproducibility (1).
Intra-laboratory reproducibility: A determination of the extent that qualified people within the same laboratory can successfully replicate results using a specific protocol at different times. Also referred to as “within-laboratory reproducibility” (1).
LEC: Lowest effective concentration is the lowest concentration of test chemical that produces a response (i.e. the lowest test chemical concentration at which the fold induction is statistically different from the concurrent vehicle control).
Me-too test: A colloquial expression for an assay that is structurally and functionally similar to a validated and accepted reference test method. Interchangeably used with similar test method.
MT: Metallothionein
MMTV: Mouse Mammary Tumor Virus
OHT: 4-Hydroxytamoxifen
PBTG: Performance-Based Test Guideline
PC (Positive control): a strongly active substance, preferably 17β-estradiol that is included in all tests to help ensure proper functioning of the assay.
PC10: the concentration of a test chemical at which the measured activity in an agonist assay is 10 % of the maximum activity induced by the PC (E2 at 1 nM for the STTA assay) in each plate.
PC50: the concentration of a test chemical at which the measured activity in an agonist assay is 50 % of the maximum activity induced by the PC (E2 at the reference concentration specified in the test method) in each plate.
PCMax: the concentration of a test chemical inducing the RPCMax
Performance standards: Standards, based on a validated assay, that provide a basis for evaluating the comparability of a proposed assay that is mechanistically and functionally similar. Included are (1) essential assay components; (2) a minimum list of reference chemicals selected from among the chemicals used to demonstrate the acceptable performance of the validated test method; and (3) the comparable levels of accuracy and reliability, based on what was obtained for the validated test method, that the proposed assay should demonstrate when evaluated using the minimum list of reference chemicals (1).
Proficiency substances: A subset of the reference substances included in the Performance Standards that can be used by laboratories to demonstrate technical competence with a standardised test method. Selection criteria for these substances typically include that they represent the range of responses, are commercially available, and have high quality reference data available.
Proficiency: The demonstrated ability to properly conduct an assay prior to testing unknown substances.
Reference estrogen (Positive control, PC): 17β-estradiol (E2, CAS 50-28-2).
Reference standard: a reference substance used to demonstrate the adequacy of an assay. 17β-estradiol is the reference standard for the STTA and VM7Luc ER TA assays.
Reference test methods: The assays upon which PBTG 455 is based.
Relevance: Description of relationship of an assay to the effect of interest and whether it is meaningful and useful for a particular purpose. It is the extent to which the assay correctly measures or predicts the biological effect of interest. Relevance incorporates consideration of the accuracy (concordance) of an assay (1).
Reliability: Measure of the extent that an assay can be performed reproducibly within and between laboratories over time, when performed using the same protocol. It is assessed by calculating intra- and inter-laboratory reproducibility.
RLU: Relative Light Units
RNA: Ribonucleic Acid
RPCMax: maximum level of response induced by a test chemical, expressed as a percentage of the response induced by 1 nM E2 on the same plate
RPMI: RPMI 1640 medium supplemented with 0,9 % Pen-Strep and 8,0 % fetal bovine serum (FBS)
Run: An individual experiment that evaluates chemical action on the biological outcome of the assay. Each run is a complete experiment performed on replicate wells of cells plated from a common pool of cells at the same time.
Independent run: A separate, independent experiment that evaluates chemical action on the biological outcome of the assay, using cells from a different pool, freshly diluted chemicals, conducted on different days or on the same day by different staff.
SD: Standard deviation.
Sensitivity: The proportion of all positive/active substances that are correctly classified by the assay. It is a measure of accuracy for an assay that produces categorical results, and is an important consideration in assessing the relevance of an assay (1).
Specificity: The proportion of all negative/inactive substances that are correctly classified by the test. It is a measure of accuracy for an assay that produces categorical results, and is an important consideration in assessing the relevance of an assay (1).
Stable transfection: When DNA is transfected into cultured cells in such a way that it is stably integrated into the cells' genome, resulting in the stable expression of transfected genes. Clones of stably transfected cells are selected by stable markers (e.g. resistance to G418).
STTA Assay: Stably Transfected Transactivation Assay, the ERα transcriptional activation assay using the HeLa 9903 Cell Line.
Study: The full range of experimental work performed to evaluate a single, specific substance using a specific assay. A study comprises all steps including tests of dilution of test substance in the test media, preliminary range finding runs, all necessary comprehensive runs, data analyses, quality assurance, cytotoxicity assessments, etc. Completion of a study allows the classification of the test chemical activity on the toxicity target (i.e. active, inactive or inconclusive) that is evaluated by the assay used and an estimate of potency relative to the positive reference chemical.
Substance: Under REACH (61), a substance is defined as a chemical element and its compounds in the natural state or obtained by any manufacturing process, including any additive necessary to preserve its stability and any impurities deriving from the process used, but excluding any solvent which may be separated without affecting the stability of the substance or changing its composition. A very similar definition is used in the context of the UN GHS (1).
TA (Transactivation): The initiation of mRNA synthesis in response to a specific chemical signal, such as a binding of an estrogen to the estrogen receptor
Assay: Within the context of this test method, an assay is one of the methodologies accepted as valid in meeting the outlined performance criteria. Components of assay include, for example, the specific cell line with associated growth conditions, specific media in which the test is conducted, plate set up conditions, arrangement and dilutions of test chemicals along with any other required quality control measures and associated data evaluation steps.
Test chemical: Any substance or mixture tested using this test method.
Transcription: mRNA synthesis
UVCB: Chemical Substances of Unknown or Variable Composition, Complex Reaction Products and Biological Materials
Validated test method: An assay for which validation studies have been completed to determine the relevance (including accuracy) and reliability for a specific purpose. It is important to note that a validated test method may not have sufficient performance in terms of accuracy and reliability to be found acceptable for the proposed purpose (1).
Validation: The process by which the reliability and relevance of a particular approach, method, assay, process or assessment is established for a defined purpose (1).
VC (Vehicle control): The solvent that is used to dissolve test and control chemicals is tested solely as vehicle without dissolved chemical.
VM7: An immortalised adenocarcinoma cell that endogenously expresses estrogen receptor.
VM7Luc4E2: The VM7Luc4E2 cell line was derived from VM7 immortalised human-derived adenocarcinoma cells that endogenously express both forms of the estrogen receptor (ERα and ERβ) and have been stably transfected with the plasmid pGudLuc7.ERE. This plasmid contains four copies of a synthetic oligonucleotide containing the estrogen response element upstream of the mouse mammary tumor viral (MMTV) promoter and the firefly luciferase gene.
Weak positive control: A weakly active substance selected from the reference chemicals list that is included in all tests to help ensure proper functioning of the assay.
Appendix 2
STABLY TRANSFECTED HUMAN ESTROGEN RECEPTOR-Α TRANSACTIVATION ASSAY FOR DETECTION OF ESTROGENIC AGONIST AND ANTAGONIST ACTIVITY OF CHEMICALS USING THE HERΑ-HELA-9903 CELL LINE
INITIAL CONSIDERATIONS AND LIMITATIONS (SEE ALSO GENERAL INTRODUCTION)
1.
This transactivation (TA) assay uses the hERα-HeLa-9903 cell line to detect estrogenic agonist activity mediated through human estrogen receptor alpha (hERα). The validation study of the Stably Transfected Transactivation (STTA) Assay by the Japanese Chemicals Evaluation and Research Institute (CERI) using the hERα-HeLa-9903 cell line to detect estrogenic agonist and antagonist activity mediated through human estrogen receptor alpha (hERα) demonstrated the relevance and reliability of the assay for its intended purpose (1).
2.
This assay is specifically designed to detect hERα-mediated TA by measuring chemiluminescence as the endpoint. However, non-receptor-mediated luminescence signals have been reported at phytoestrogen concentrations higher than 1 μM due to the over-activation of the luciferase reporter gene (2) (3). While the dose-response curve indicates that true activation of the ER system occurs at lower concentrations, luciferase expression obtained at high concentrations of phytoestrogens or similar compounds suspected of producing phytoestrogen-like over-activation of the luciferase reporter gene needs to be examined carefully in stably transfected ER TA assay systems (Appendix 1).
3.
The sections “GENERAL INTRODUCTION” and “ER TA ASSAY COMPONENTS” should be read before using this assay for regulatory purposes. Definitions and abbreviations used in this TG are described in Appendix 2.1.
PRINCIPLE OF THE ASSAY (SEE ALSO GENERAL INTRODUCTION)
4.
The assay is used to signal binding of the estrogen receptor with a ligand. Following ligand binding, the receptor-ligand complex translocates to the nucleus where it binds specific DNA response elements and transactivates a firefly luciferase reporter gene, resulting in increased cellular expression of luciferase enzyme. Luciferin is a substrate that is transformed by the luciferase enzyme to a bioluminescence product that can be quantitatively measured with a luminometer. Luciferase activity can be evaluated quickly and inexpensively with a number of commercially available test kits.
5.
The test system utilises the hERα-HeLa-9903 cell line, which is derived from a human cervical tumor, with two stably inserted constructs: (i) the hERα expression construct (encoding the full-length human receptor), and (ii) a firefly luciferase reporter construct bearing five tandem repeats of a vitellogenin Estrogen-Responsive Element (ERE) driven by a mouse metallothionein (MT) promoter TATA element. The mouse MT TATA gene construct has been shown to have the best performance, and so is commonly used. Consequently this hERα-HeLa-9903 cell line can measure the ability of a test chemical to induce hERα-mediated transactivation of luciferase gene expression.
6.
In the case of the ER agonist assay, data interpretation is based upon whether or not the maximum response level induced by a test chemical equals or exceeds an agonist response equal to 10 % of that induced by a maximally inducing (1 nM) concentration of the positive control (PC) 17β-estradiol (E2) (i.e. the PC10). In the case of the ER antagonist assay, data interpretation is based upon whether or not the response shows at least a 30 % reduction in activity from the response induced by the spike-in control (25 pM of E2) without cytotoxicity. Data analysis and interpretation are discussed in detail in paragraphs 34 - 48.
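The two decision thresholds described in this paragraph can be sketched as follows. This is a simplified illustration, not the full data interpretation procedure of paragraphs 34 - 48: the function names are hypothetical, and the inputs are assumed to be responses already normalised per plate (as a percentage of the 1 nM E2 response for the agonist assay, and as a percentage of the 25 pM E2 spike-in control response for the antagonist assay).

```python
def agonist_positive(relative_responses_pct, threshold_pct=10.0):
    """True if the maximum response reaches 10 % of the 1 nM E2 response."""
    return max(relative_responses_pct) >= threshold_pct


def antagonist_positive(relative_responses_pct, cytotoxic_flags,
                        min_reduction_pct=30.0):
    """True if any non-cytotoxic concentration reduces the spike-in
    control response (taken as 100 %) by at least 30 %."""
    return any(
        (100.0 - response) >= min_reduction_pct
        for response, cytotoxic in zip(relative_responses_pct, cytotoxic_flags)
        if not cytotoxic
    )
```

Each list holds one value per tested concentration; `cytotoxic_flags` marks the concentrations at which cytotoxicity was observed, so that a reduction caused by cell death is not counted as antagonism.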
PROCEDURE
Cell Lines
7.
The stably transfected hERα-HeLa-9903 cell line should be used for the assay. The cell line can be obtained from the Japanese Collection of Research Bioresources (JCRB) Cell Bank (62), upon signing a Material Transfer Agreement (MTA).
8.
Only cells characterised as mycoplasma-free should be used in testing. RT-PCR (Real Time Polymerase Chain Reaction) is the method of choice for a sensitive detection of mycoplasma infection (4) (5) (6).
Stability of the cell line
9.
To monitor the stability of the cell line, E2, 17α-estradiol, 17α-methyltestosterone and corticosterone should be used as the reference standards for agonist assay and a complete concentration-response curve in the test concentration range provided in Table 1 should be measured at least once each time the assay is performed, and the results should be in agreement with the results provided in Table 1.
10.
In case of antagonist assay, complete concentration curves for two reference standards, tamoxifen and flutamide, should be measured simultaneously with each run. Correct qualitative classification as positive or negative for the two chemicals should be monitored.
Cell Culture and Plating Conditions
11.
Cells should be maintained in Eagle’s Minimum Essential Medium (EMEM) without phenol red, supplemented with 60 mg/l of the antibiotic kanamycin and 10 % dextran-coated-charcoal-treated fetal bovine serum (DCC-FBS), in a CO2 incubator (5 % CO2) at 37±1 °C. Upon reaching 75-90 % confluency, cells can be subcultured at 10 ml of 0,4 × 10⁵ to 1 × 10⁵ cells/ml for a 100 mm cell culture dish. Cells should be suspended with 10 % FBS-EMEM (which is the same as EMEM with DCC-FBS) and then plated into wells of a microplate at a density of 1 × 10⁴ cells/(100 μl × well). Next, the cells should be pre-incubated in a 5 % CO2 incubator at 37±1 °C for 3 hours before the chemical exposure. The plastic-ware should be free of estrogenic activity.
12.
To maintain the integrity of the response, the cells should be grown for more than one passage from the frozen stock in the conditioned media and should not be cultured for more than 40 passages. For the hERα-HeLa-9903 cell line, this will be less than three months. However the performance of cells may be reduced if they are grown in inappropriate culture conditions.
13.
The DCC-FBS can be prepared as described in Appendix 2.2, or obtained from commercial sources.
Acceptability criteria
Positive and negative reference standards for ER agonist assay
14.
Prior to and during the study, the responsiveness of the test system should be verified using the appropriate concentrations of a strong estrogen (E2), a weak estrogen (17α-estradiol), a very weak agonist (17α-methyltestosterone), and a negative substance (corticosterone). Acceptable range values derived from the validation study (1) are given in Table 1. These four concurrent reference standards should be included with each experiment and the results should fall within the given acceptable limits. If this is not the case, the cause for the failure to meet the acceptability criteria should be determined (e.g. cell handling, and serum and antibiotics for quality and concentration) and the assay repeated. Once the acceptability criteria have been achieved, to ensure minimum variability of EC50, PC50 and PC10 values, consistent use of materials for cell culturing is essential. The four concurrent reference standards, which should be included in each experiment (conducted under the same conditions including the materials, passage level of cells and technicians), can ensure the sensitivity of the assay because the PC10s of the three positive reference standards should fall within the acceptable range, as should the PC50s and EC50s where they can be calculated (see Table 1).
Table 1
Acceptable range values of the four reference standards for the ER agonist assay
| Name | logPC50 | logPC10 | logEC50 | Hill slope | Test range |
| --- | --- | --- | --- | --- | --- |
| 17β-estradiol (E2) CAS No: 50-28-2 | -11,4 ~ -10,1 | < -11 | -11,3 ~ -10,1 | 0,7 ~ 1,5 | 10⁻¹⁴ ~ 10⁻⁸ M |
| 17α-estradiol CAS No: 57-91-0 | -9,6 ~ -8,1 | -10,7 ~ -9,3 | -9,6 ~ -8,4 | 0,9 ~ 2,0 | 10⁻¹² ~ 10⁻⁶ M |
| Corticosterone CAS No: 50-22-6 | — | — | — | — | 10⁻¹⁰ ~ 10⁻⁴ M |
| 17α-methyltestosterone CAS No: 58-18-4 | -6,0 ~ -5,1 | -8,0 ~ -6,2 | — | — | 10⁻¹¹ ~ 10⁻⁵ M |
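The comparison of measured reference standard values against Table 1 can be sketched as follows. This is a minimal illustration, not part of the test method: the dictionary layout, the key names and the function `parameters_out_of_range` are hypothetical, range limits are assumed to be inclusive, and parameters that could not be calculated are simply skipped.

```python
# Acceptable ranges transcribed from Table 1 (log10 of the molar concentration).
# Each entry is (low, high, high_is_exclusive); the E2 logPC10 entry encodes "< -11".
# Treating the other limits as inclusive is an assumption of this sketch.
NEG_INF = float("-inf")
TABLE_1 = {
    "E2": {
        "logPC50": (-11.4, -10.1, False),
        "logPC10": (NEG_INF, -11.0, True),
        "logEC50": (-11.3, -10.1, False),
        "hill_slope": (0.7, 1.5, False),
    },
    "17alpha-estradiol": {
        "logPC50": (-9.6, -8.1, False),
        "logPC10": (-10.7, -9.3, False),
        "logEC50": (-9.6, -8.4, False),
        "hill_slope": (0.9, 2.0, False),
    },
    "17alpha-methyltestosterone": {
        "logPC50": (-6.0, -5.1, False),
        "logPC10": (-8.0, -6.2, False),
    },
    "corticosterone": {},  # negative standard: Table 1 gives no ranges
}


def parameters_out_of_range(name, measured):
    """Return the names of measured parameters that fall outside Table 1."""
    failures = []
    for parameter, (low, high, exclusive) in TABLE_1[name].items():
        value = measured.get(parameter)
        if value is None:
            continue  # parameter could not be calculated in this run
        too_high = value >= high if exclusive else value > high
        if value < low or too_high:
            failures.append(parameter)
    return failures
```

An empty list for every reference standard would indicate that the run is consistent with Table 1; any listed parameter points to a value that needs investigation before the assay is repeated.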
Positive and negative reference standards for ER antagonist assay
15.
Prior to and during the study, the responsiveness of the test system should be verified using the appropriate concentrations of a positive substance (Tamoxifen), and a negative substance (Flutamide). Acceptable range values derived from the validation study (1) are given in Table 2. These two concurrent reference standards should be included with each experiment and the results should be judged correct as shown in the criteria. If this is not the case, the cause for the failure to meet the criteria should be determined (e.g. cell handling, and serum and antibiotics for quality and concentration) and the assay repeated. In addition, IC50 values for a positive substance (Tamoxifen) should be calculated and the results should fall within the given acceptable limits. Once the acceptability criteria have been achieved, to ensure minimum variability of IC50 values, consistent use of materials for cell culturing is essential. The two concurrent reference standards, which should be included in each experiment (conducted under the same conditions including the materials, passage level of cells and technicians), can ensure the sensitivity of the assay (see Table 2).
Table 2
Criteria and acceptable range values of the two reference standards for the ER antagonist assay
| Name | Criteria | LogIC50 | Test range |
| --- | --- | --- | --- |
| Tamoxifen CAS No: 10540-29-1 | Positive: IC50 should be calculated | -5,942 ~ -7,596 | 10⁻¹⁰ ~ 10⁻⁵ M |
| Flutamide CAS No: 13311-84-7 | Negative: IC30 should not be calculated | — | 10⁻¹⁰ ~ 10⁻⁵ M |
Positive and Vehicle Controls
16.
The positive control (PC) for ER agonist assay (1 nM of E2) and for ER antagonist assay (10 μM TAM) should be tested at least in triplicate in each plate. The vehicle that is used to dissolve a test chemical should be tested as a vehicle control (VC) at least in triplicate in each plate. In addition to this VC, if the PC uses a different vehicle than the test chemical, another VC should be tested at least in triplicate on the same plate with the PC.
Quality criteria for ER agonist assay
17.
The mean luciferase activity of the positive control (1 nM E2) should be at least 4-fold that of the mean VC on each plate. This criterion is established based on the reliability of the endpoint values from the validation study (historically between four- and 30-fold).
18.
With respect to the quality control of the assay, the fold-induction corresponding to the PC10 value of the concurrent PC (1 nM E2) should be greater than 1+2SD of the fold-induction value (=1) of the concurrent VC. For prioritisation purposes, the PC10 value can be useful to simplify the data analysis required compared to a statistical analysis. Although a statistical analysis provides information on significance, such an analysis is not a quantitative parameter with respect to concentration-based potential, and so is less useful for prioritisation purposes.
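The two agonist quality criteria in paragraphs 17 and 18 can be checked per plate along the following lines. This is a sketch under stated assumptions: the function name and returned keys are hypothetical, the fold-induction at PC10 is taken as 1 + 0,1 × (fold-induction of PC − 1), i.e. 10 % of the VC-subtracted PC response, and the sample standard deviation of the VC wells is used.

```python
from statistics import mean, stdev


def agonist_plate_qc(vc_wells, pc_wells):
    """Check the agonist plate quality criteria from raw luminescence values."""
    vc_mean = mean(vc_wells)
    fold_pc = mean(pc_wells) / vc_mean
    # Fold-induction of each VC well relative to the VC mean (mean is 1 by construction).
    vc_fold_sd = stdev([w / vc_mean for w in vc_wells])
    # Assumed reading of "fold-induction corresponding to the PC10 value":
    # 10 % of the PC response above the VC baseline.
    fold_pc10 = 1.0 + 0.1 * (fold_pc - 1.0)
    return {
        "fold_pc": fold_pc,
        "pc_at_least_4fold": fold_pc >= 4.0,
        "fold_pc10": fold_pc10,
        "pc10_above_vc_plus_2sd": fold_pc10 > 1.0 + 2.0 * vc_fold_sd,
    }
```

For example, VC wells of 100, 110 and 90 and PC wells of 1 000, 1 100 and 900 give a 10-fold induction for the PC and a fold-induction of 1,9 at PC10, which lies above 1 + 2SD = 1,2.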
Quality criteria for ER antagonist assay
19.
The mean luciferase activity of the spike in control (25 pM E2) should be at least 4-fold that of the mean VC on each plate. This criterion is established based on the reliability of the endpoint values from the validation study.
20.
With respect to the quality control of the assay, relative transcriptional activation (RTA) of 1 nM E2 should be greater than 100 %, RTA of 1 μM 4-Hydroxytamoxifen (OHT) should be less than 40,6 % and RTA of 100 μM Digitonin (Dig) should be less than 0 %.
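The antagonist quality criteria in paragraphs 19 and 20 reduce to four threshold comparisons. The sketch below assumes that mean raw luminescence values and RTA values (in % of the spike-in control) have already been computed; the function and key names are hypothetical.

```python
def antagonist_plate_qc(vc_mean, spike_mean, rta_e2_1nm, rta_oht_1um, rta_dig_100um):
    """Return a pass/fail flag for each antagonist plate quality criterion."""
    return {
        # Spike-in control (25 pM E2) at least 4-fold the mean VC.
        "spike_at_least_4fold": spike_mean / vc_mean >= 4.0,
        # RTA of 1 nM E2 greater than 100 %.
        "e2_rta_above_100": rta_e2_1nm > 100.0,
        # RTA of 1 uM OHT less than 40,6 %.
        "oht_rta_below_40_6": rta_oht_1um < 40.6,
        # RTA of 100 uM digitonin less than 0 %.
        "dig_rta_below_0": rta_dig_100um < 0.0,
    }
```

A plate would be regarded as acceptable in this sketch only if all four flags are true.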
Demonstration of Laboratory Proficiency (see paragraph 14 and Tables 3 and 4 in “ER TA ASSAY COMPONENTS” of this test method).
Vehicle
21.
Dimethyl sulfoxide (DMSO), or appropriate solvent, at the same concentration used for the different positive and negative controls and the test chemicals should be used as the concurrent VC. Test chemicals should be dissolved in a solvent that solubilises that test chemical and is miscible with the cell medium. Water, ethanol (95 % to 100 % purity) and DMSO are suitable vehicles. If DMSO is used, the level should not exceed 0,1 % (v/v). For any vehicle, it should be demonstrated that the maximum volume used is not cytotoxic and does not interfere with assay performance.
Preparation of Test Chemicals
22.
Generally, the test chemicals should be dissolved in DMSO or other suitable solvent, and serially diluted with the same solvent at a common ratio of 1:10 in order to prepare solutions for dilution with media.
Solubility and Cytotoxicity: Considerations for Range Finding.
23.
A preliminary test should be carried out to determine the appropriate concentration range of chemical to be tested, and to ascertain whether the test chemical may have any solubility and cytotoxicity problems. Initially, chemicals are tested up to the maximum concentration of 1 μl/ml, 1 mg/ml, or 1 mM, whichever is the lowest. Based on the extent of cytotoxicity or lack of solubility observed in the preliminary test, the first definite run should test the chemical at log-serial dilutions starting at the maximum acceptable concentration (e.g. 1 mM, 100 μM, 10 μM, etc.) and the presence of cloudiness or precipitate or cytotoxicity should be noted. Concentrations in the second, and if necessary third run should be adjusted as appropriate to better characterise the concentration-response curve and to avoid concentrations which are found to be insoluble or to induce excessive cytotoxicity.
24.
For ER agonists and antagonists, the presence of increasing levels of cytotoxicity can significantly alter or eliminate the typical sigmoidal response and should be considered when interpreting the data. Cytotoxicity testing methods that can provide information regarding 80 % cell viability should be used, utilising an appropriate assay based upon laboratory experience.
25.
Should the results of the cytotoxicity test show that the concentration of the test chemical has reduced the cell number by 20 % or more, this concentration should be regarded as cytotoxic, and the concentrations at or above the cytotoxic concentration should be excluded from the evaluation.
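The exclusion rule in paragraph 25 can be illustrated as follows. The sketch assumes that cell viability is expressed as a percentage of the vehicle control, so that a reduction of the cell number by 20 % or more corresponds to a viability of 80 % or less; the function name and the data layout are hypothetical.

```python
def usable_concentrations(viability_by_concentration, cutoff_percent=80.0):
    """Return the concentrations that remain after excluding the cytotoxic range.

    viability_by_concentration maps concentration (mol/l) to viability (% of VC).
    Every concentration at or above the lowest cytotoxic one is excluded.
    """
    cytotoxic = [c for c, v in viability_by_concentration.items() if v <= cutoff_percent]
    if not cytotoxic:
        return sorted(viability_by_concentration)
    lowest_cytotoxic = min(cytotoxic)
    return sorted(c for c in viability_by_concentration if c < lowest_cytotoxic)
```

With viabilities of 40 % at 1 mM, 78 % at 100 μM, 95 % at 10 μM and 99 % at 1 μM, only the 1 μM and 10 μM concentrations would be kept for the evaluation.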
Chemical Exposure and Assay Plate Organisation
26.
The procedure for chemical dilutions (Steps-1 and 2) and exposure to cells (Step-3) can be conducted as follows:
Step-1: Each test chemical should be serially diluted in DMSO, or appropriate solvent, and added to the wells of a microtitre plate to achieve final serial concentrations as determined by the preliminary range finding test (typically in a series of, for example, 1 mM, 100 μM, 10 μM, 1 μM, 100 nM, 10 nM, 1 nM, 100 pM, and 10 pM (10⁻³ to 10⁻¹¹ M)) for triplicate testing.
Step-2: Chemical dilution: First dilute 1,5 μl of the test chemical in the solvent to a volume of 500 μl of media.
Step-3: Chemical exposure of the cells: Add 50 μl of dilution with media (prepared in Step-2) to an assay well containing 10⁴ cells/100 μl/well.
The recommended final volume of media required for each well is 150 μl. Test samples and reference standards can be assigned as shown in Table 3 and Table 4.
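The dilution arithmetic of Steps 2 and 3 can be worked through as below. The sketch assumes that 1,5 μl of solvent stock is made up to a total of 500 μl with media and that 50 μl of this dilution ends up in a final well volume of 150 μl; the function name is hypothetical.

```python
def final_in_well(stock_value, stock_ul=1.5, media_total_ul=500.0,
                  transfer_ul=50.0, final_well_ul=150.0):
    """Scale a stock concentration (any unit) down to its value in the assay well."""
    intermediate = stock_value * stock_ul / media_total_ul   # Step-2 dilution
    return intermediate * transfer_ul / final_well_ul        # Step-3 dilution


# Pure solvent is 100 % (v/v); its share in the well after both steps:
solvent_percent_in_well = final_in_well(100.0)
# Overall dilution factor from solvent stock to well:
overall_dilution_factor = 1.0 / final_in_well(1.0)
```

Under these assumptions the overall dilution is 1 000-fold, so the solvent content in the well is 0,1 % (v/v), which matches the DMSO limit given for the vehicle, and a final concentration of 1 μM corresponds to a 1 mM solvent stock.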
Table 3
Example of plate concentration assignment of the reference standards in the assay plate in ER agonist assay
| Row | 17α-methyltestosterone (columns 1-3) | Corticosterone (columns 4-6) | 17α-estradiol (columns 7-9) | E2 (columns 10-12) |
| --- | --- | --- | --- | --- |
| A | conc 1 (10 μM) | 100 μM | 1 μM | 10 nM |
| B | conc 2 (1 μM) | 10 μM | 100 nM | 1 nM |
| C | conc 3 (100 nM) | 1 μM | 10 nM | 100 pM |
| D | conc 4 (10 nM) | 100 nM | 1 nM | 10 pM |
| E | conc 5 (1 nM) | 10 nM | 100 pM | 1 pM |
| F | conc 6 (100 pM) | 1 nM | 10 pM | 0,1 pM |
| G | conc 7 (10 pM) | 100 pM | 1 pM | 0,01 pM |
| H | VC | VC | PC | PC |
Each entry applies to all three wells of the indicated columns; in row H, VC occupies columns 1-6 and PC occupies columns 7-12.
VC: Vehicle control (0,1 % DMSO); PC: Positive control (1 nM E2)
27.
The reference standards (E2, 17α-estradiol, 17α-methyl testosterone and corticosterone) should be tested in every run (Table 3). PC wells treated with 1 nM of E2 that can produce maximum induction of E2 and VC wells treated with DMSO (or appropriate solvent) alone should be included in each test assay plate (Table 4). If cells from different sources (e.g. different passage number, different lot, etc.) are used in the same experiment, the reference standards should be tested for each cell source.
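A plate map for the reference standards can be generated programmatically, for example to label luminometer output. The sketch below assumes the example layout of Table 3 (three columns per standard, rows A to G as a 1:10 dilution series, row H for the plate controls); the names `STANDARDS` and `reference_plate` are hypothetical.

```python
DILUTION_ROWS = "ABCDEFG"

# Top (row A) concentration in mol/l and first plate column of each standard,
# taken from the example layout in Table 3.
STANDARDS = {
    "17alpha-methyltestosterone": (1e-5, 1),
    "corticosterone": (1e-4, 4),
    "17alpha-estradiol": (1e-6, 7),
    "E2": (1e-8, 10),
}


def reference_plate():
    """Map each well of a 96-well plate to (content, concentration in mol/l)."""
    plate = {}
    for name, (top, first_column) in STANDARDS.items():
        for step, row in enumerate(DILUTION_ROWS):
            for column in range(first_column, first_column + 3):  # triplicate wells
                plate[f"{row}{column}"] = (name, top / 10 ** step)
    for column in range(1, 13):
        # Row H: vehicle control in columns 1-6, positive control (1 nM E2) in 7-12.
        plate[f"H{column}"] = ("VC", None) if column <= 6 else ("PC", 1e-9)
    return plate
```

The same structure could be reused for test chemicals by replacing the entries of `STANDARDS` with the top concentrations chosen in the range finding test.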
Table 4
Example of plate concentration assignment of test and plate control chemicals in the assay plate in ER agonist assay
| Row | Test Chemical 1 (columns 1-3) | Test Chemical 2 (columns 4-6) | Test Chemical 3 (columns 7-9) | Test Chemical 4 (columns 10-12) |
| --- | --- | --- | --- | --- |
| A | conc 1 (10 μM) | 1 mM | 1 μM | 10 nM |
| B | conc 2 (1 μM) | 100 μM | 100 nM | 1 nM |
| C | conc 3 (100 nM) | 10 μM | 10 nM | 100 pM |
| D | conc 4 (10 nM) | 1 μM | 1 nM | 10 pM |
| E | conc 5 (1 nM) | 100 nM | 100 pM | 1 pM |
| F | conc 6 (100 pM) | 10 nM | 10 pM | 0,1 pM |
| G | conc 7 (10 pM) | 1 nM | 1 pM | 0,01 pM |
| H | VC | VC | PC | PC |
Each entry applies to all three wells of the indicated columns; in row H, VC occupies columns 1-6 and PC occupies columns 7-12.
VC: Vehicle control (0,1 % DMSO); PC: Positive control (1 nM E2)
Table 5
Example of plate concentration assignment of the reference standards in the assay plate in ER antagonist assay
28.
To evaluate the antagonist activity of chemicals, assay wells located in rows from A to G should be spiked with 25 pM E2. The reference standards (Tamoxifen and Flutamide) should be tested in every run. PC wells treated with 1 nM of E2, which serve as quality control of the hERα-HeLa-9903 cell line, VC wells treated with DMSO (or appropriate solvent), wells treated with 0,1 % DMSO in addition to the spiked E2 (corresponding to the “spike-in control”), wells treated with 1 μM OHT (final concentration) and wells treated with 100 μM Dig should be included in each test assay plate (Table 5). Subsequent assay plates should follow the same plate layout without the reference standard wells (Table 6). If cells from different sources (e.g. different passage number, different lot, etc.) are used in the same experiment, the reference standards should be tested for each cell source.
Table 6
Example of plate concentration assignment of test and plate control chemicals in the assay plate in ER antagonist assay
29.
The lack of edge effects should be confirmed, as appropriate, and if edge effects are suspected, the plate layout should be altered to avoid such effects. For example, a plate layout excluding the edge wells can be employed.
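A layout that excludes the edge wells can be derived by listing only the inner wells of the plate. The following sketch assumes a standard 8 × 12 (96-well) format by default; the function name is hypothetical.

```python
def inner_wells(n_rows=8, n_columns=12):
    """List the wells that are not on the outer edge of the plate."""
    row_labels = "ABCDEFGHIJKLMNOP"[:n_rows]
    return [
        f"{row}{column}"
        for row in row_labels[1:-1]          # drop first and last row
        for column in range(2, n_columns)    # drop first and last column
    ]
```

For a 96-well plate this leaves 60 wells (rows B to G, columns 2 to 11), so fewer concentrations or chemicals fit on one plate when edge wells are avoided.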
30.
After adding the chemicals, the assay plates should be incubated in a 5 % CO2 incubator at 37±1 °C for 20-24 hours to induce the reporter gene products.
31.
Special considerations will need to be applied to those compounds that are highly volatile. In such cases, nearby control wells may generate false positives and this should be considered in light of expected and historical control values. In the few cases where volatility may be of concern, the use of “plate sealers” may help to effectively isolate individual wells during testing, and is therefore recommended in such cases.
32.
Repeat definitive tests for the same chemical should be conducted on different days, to ensure independence.
Luciferase assay
33.
A commercial luciferase assay reagent [e.g. Steady-Glo® Luciferase Assay System (Promega, E2510, or equivalent)] or a standard luciferase assay system (e.g. Promega, E1500, or equivalent) can be used for the assay, as long as the acceptability criteria are met. The assay reagents should be selected based on the sensitivity of the luminometer to be used. When using the standard luciferase assay system, Cell Culture Lysis Reagent (e.g. Promega, E1531, or equivalent) should be used before adding the substrate. The luciferase reagent should be applied following the manufacturers’ instructions.
ANALYSIS OF DATA
ER agonist assay
34.
In case of ER agonist assay, to obtain the relative transcriptional activity to PC (1 nM of E2), the luminescence signals from the same plate can be analysed according to the following steps (other equivalent mathematical processes are also acceptable):
Step 1. Calculate the mean value for the VC.
Step 2. Subtract the mean value of the VC from each well value to normalise the data.
Step 3. Calculate the mean for the normalised PC.
Step 4. Divide the normalised value of each well in the plate by the mean value of the normalised PC (PC=100 %).
The final value of each well is the relative transcriptional activity for that well compared to the PC response.
Step 5. Calculate the mean value of the relative transcriptional activity for each concentration group of the test chemical. There are two dimensions to the response: the averaged transcriptional activity (response) and the concentration at which the response occurs (see following section).
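Steps 1 to 5 can be sketched as follows for the wells of a single plate. The function names and the data layout (a mapping from concentration to the raw luminescence values of its replicate wells) are hypothetical; other equivalent calculations are equally acceptable.

```python
from statistics import mean


def relative_transcriptional_activity(wells, vc_wells, control_wells):
    """Express raw well values as % of the VC-subtracted control response."""
    vc_mean = mean(vc_wells)                                        # Step 1
    normalised_control = mean([w - vc_mean for w in control_wells]) # Steps 2 and 3
    return [100.0 * (w - vc_mean) / normalised_control for w in wells]  # Step 4


def mean_activity_by_concentration(readings, vc_wells, control_wells):
    """Step 5: mean relative transcriptional activity per concentration group."""
    return {
        concentration: mean(relative_transcriptional_activity(values, vc_wells, control_wells))
        for concentration, values in readings.items()
    }
```

For the agonist assay the control wells are the PC wells (1 nM E2), so that the PC corresponds to 100 %. With VC wells of 100 and PC wells of 1 100, a raw value of 600 corresponds to a relative transcriptional activity of 50 %.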
EC50, PC50 and PC10 induction considerations
35.
The full concentration-response curve is required for the calculation of the EC50, but this may not always be achievable or practical due to limitations of the test concentration range (for example due to cytotoxicity or solubility problems). However, as the EC50 and maximum induction level (corresponding to the top value of the Hill-equation) are informative parameters, these parameters should be reported where possible. For the calculation of EC50 and maximum induction level, appropriate statistical software should be used (e.g. Graphpad Prism statistical software). If the Hill’s logistic equation is applicable to the concentration response data, the EC50 should be calculated by the following equation (7):
Y = Bottom + (Top − Bottom) / (1 + 10^((log EC50 − X) × Hill slope))
Where:
X is the logarithm of concentration; and,
Y is the response and Y starts at the Bottom and goes to the Top in a sigmoid curve. Bottom is fixed at zero in the Hill’s logistic equation.
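The Hill equation above can be evaluated directly, as in the sketch below. The small grid search that follows it is only an illustration of how log EC50 relates to the data: it holds Top and Hill slope fixed, which is a simplifying assumption of this sketch, whereas dedicated statistical software would normally fit all parameters by nonlinear regression. Both function names are hypothetical.

```python
def hill(x, top, log_ec50, hill_slope, bottom=0.0):
    """Hill's logistic equation; x is the log10 concentration, bottom is fixed at zero."""
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - x) * hill_slope))


def fit_log_ec50(xs, ys, top, hill_slope=1.0, step=0.01):
    """Grid-search the log EC50 that minimises the squared error (Top and slope fixed)."""
    low, high = min(xs), max(xs)
    n_steps = int(round((high - low) / step))
    candidates = [low + i * step for i in range(n_steps + 1)]

    def squared_error(candidate):
        return sum((hill(x, top, candidate, hill_slope) - y) ** 2 for x, y in zip(xs, ys))

    return min(candidates, key=squared_error)
```

At X = log EC50 the function returns the midpoint between Bottom and Top, which is the defining property of the EC50.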
36.
For each test chemical, the following should be provided:
The RPCMax which is the maximum level of response induced by a test chemical, expressed as a percentage of the response induced by 1 nM E2 on the same plate, as well as the PCMax (concentration associated with the RPCMax); and
For positive chemicals, the concentrations that induce the PC10 and, if appropriate, the PC50.
37.
The PCx value can be calculated by interpolating between 2 points on the X-Y coordinate, one immediately above and one immediately below a PCx value. Where the data points lying immediately above and below the PCx value have the coordinates (a,b) and (c,d) respectively, then the PCx value may be calculated using the following equation:
log[PCx] = log[c] + (x − d)/(b − d) × (log[a] − log[c])
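The interpolation can be illustrated with ordinary two-point linear interpolation on the log-concentration axis. This is a sketch under stated assumptions: (a, b) is the point with the response immediately above x, (c, d) the point immediately below, and the spacing log[a] − log[c] is written out explicitly (it equals 1 for a 1:10 dilution series), so that the result always lies between c and a. The function name is hypothetical.

```python
import math


def pc_x(x, above, below):
    """Concentration at which the response reaches x % of the PC.

    above = (a, b) and below = (c, d) are (concentration, response in %) pairs
    bracketing x, with d < x <= b.
    """
    a, b = above
    c, d = below
    log_step = math.log10(a) - math.log10(c)  # 1 for a 1:10 dilution series
    log_pcx = math.log10(c) + (x - d) / (b - d) * log_step
    return 10.0 ** log_pcx
```

For example, with responses of 2 % at 100 pM and 18 % at 1 nM, the PC10 lies halfway between the two points on the log scale, at about 316 pM.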
38.
Descriptions of PC values are provided in Figure 1 below.
Figure 1
Example of how to derive PC-values. The PC (1 nM of E2) is included on each assay plate
ER antagonist assay
39.
In case of ER antagonist assay, to obtain the relative transcriptional activity (RTA) to spike in control (25 pM of E2), the luminescence signals from the same plate can be analysed according to the following steps (other equivalent mathematical processes are also acceptable):
Step 1. Calculate the mean value for the VC.
Step 2. Subtract the mean value of the VC from each well value to normalise the data.
Step 3. Calculate the mean for the normalised spike in control.
Step 4. Divide the normalised value of each well in the plate by the mean value of the normalised spike in control (spike in control=100 %).
The final value of each well is the relative transcriptional activity for that well compared to the spike in control response.
Step 5. Calculate the mean value of the relative transcriptional activity for each treatment.
IC30 and IC50 induction considerations
40.
For positive chemicals, the concentrations that induce the IC30 and, if appropriate, the IC50 should be provided.
41.
The ICx value can be calculated by interpolating between 2 points on the X-Y coordinate, one immediately above and one immediately below an ICx value. Where the data points lying immediately above and below the ICx value have the coordinates (c,d) and (a,b) respectively, then the ICx value may be calculated using the following equation:
lin ICx = a − (b − (100 − x)) × (a − c) / (b − d)
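The ICx equation is a linear interpolation on the concentration axis towards the RTA value 100 − x. The sketch below implements it as written, assuming that (c, d) is the point with the RTA immediately above 100 − x and (a, b) the point immediately below; the function name is hypothetical.

```python
def lin_ic_x(x, above, below):
    """Concentration at which the spike-in control response is inhibited by x %.

    above = (c, d) and below = (a, b) are (concentration, RTA in %) pairs
    bracketing the target RTA of 100 - x, with b < 100 - x <= d.
    """
    c, d = above
    a, b = below
    return a - (b - (100.0 - x)) * (a - c) / (b - d)
```

For example, with an RTA of 90 % at 0,1 μM and 50 % at 1 μM, the IC30 (target RTA 70 %) is halfway between the two concentrations on the linear scale, at 0,55 μM.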
Figure 2
Example of how to derive IC-values. The spike in control (25 pM of E2) is included on each assay plate
RTA: relative transcriptional activity
42.
The results should be based on two (or three) independent runs. If two runs give comparable and therefore reproducible results, it is not necessary to conduct a third run. To be acceptable, the results should:
—
Meet the acceptability criteria (see Acceptability criteria, paragraphs 14-20),
—
Be reproducible.
Data Interpretation Criteria
Table 7
Positive and negative decision criteria in ER agonist assay
Positive: If an RPCMax is obtained that is equal to or exceeds 10 % of the response of the positive control in at least two of two or two of three runs.
Negative: If the RPCMax fails to achieve at least 10 % of the response of the positive control in two of two or two of three runs.
Table 8
Positive and negative decision criteria in ER antagonist assay
Positive: If the IC30 is calculated in at least two of two or two of three runs.
Negative: If the IC30 cannot be calculated in two of two or two of three runs.
43.
Data interpretation criteria are shown in Tables 7 and 8. Positive results will be characterised by both the magnitude of the effect and the concentration at which the effect occurs. Expressing results as the concentration at which 50 % (PC50) or 10 % (PC10) of the PC value is reached for the agonist assay, and at which 50 % (IC50) or 30 % (IC30) of the spike-in control value is inhibited for the antagonist assay, accomplishes both of these goals. However, a test chemical is determined to be positive if the maximum response induced by the test chemical (RPCMax) is equal to or exceeds 10 % of the response of the PC in at least two of two or two of three runs, while a test chemical is considered negative if the RPCMax fails to achieve at least 10 % of the response of the positive control in two of two or two of three runs.
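The "two of two or two of three runs" rule of Tables 7 and 8 can be expressed as a simple count over independent runs. In the sketch below, the function names and the label "undetermined" (returned when two runs disagree and a third run is still needed) are hypothetical; an IC30 that could not be calculated is represented by `None`.

```python
def _two_run_rule(run_is_positive):
    """Apply the two-of-two / two-of-three rule to a list of per-run outcomes."""
    positives = sum(1 for flag in run_is_positive if flag)
    negatives = len(run_is_positive) - positives
    if positives >= 2:
        return "positive"
    if negatives >= 2:
        return "negative"
    return "undetermined"  # e.g. two discordant runs: a third run is needed


def classify_agonist(rpcmax_by_run):
    """Table 7: a run is positive if RPCMax reaches at least 10 % of the PC response."""
    return _two_run_rule([rpcmax >= 10.0 for rpcmax in rpcmax_by_run])


def classify_antagonist(ic30_by_run):
    """Table 8: a run is positive if an IC30 could be calculated (not None)."""
    return _two_run_rule([ic30 is not None for ic30 in ic30_by_run])
```

With two concordant runs the classification is already final; a third run only changes the outcome when the first two disagree.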
44.
The calculations of PC10, PC50 and PCMax in ER agonist assay and IC30 and IC50 in ER antagonist assay can be made by using a spreadsheet available with the Test Guideline on the OECD public website (63).
45.
It should be sufficient to obtain PC10 or PC50 and IC30 or IC50 values at least twice. However, should the resulting base-line for data in the same concentration range show variability with an unacceptably high coefficient of variation (CV; %) the data may not be considered reliable and the source of the high variability should be identified. The CV of the raw data triplicates (i.e. luminescence intensity data) of the data points that are used for the calculation of PC10 should be less than 20 %.
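The coefficient of variation of raw triplicates can be computed as below. The sketch assumes the sample standard deviation (n − 1 in the denominator), which the text does not specify; the function names are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(raw_values):
    """Coefficient of variation (%) of raw luminescence replicates."""
    return 100.0 * stdev(raw_values) / mean(raw_values)


def triplicate_is_reliable(raw_values, limit_percent=20.0):
    """True if the CV of the replicates is below the 20 % limit."""
    return cv_percent(raw_values) < limit_percent
```

For example, raw values of 100, 110 and 90 give a CV of 10 %, which is within the limit, whereas 100, 150 and 50 give a CV of 50 %.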
46.
Meeting the acceptability criteria indicates the assay system is operating properly, but it does not ensure that any particular run will produce accurate data. Duplicating the results of the first run is the best insurance that accurate data were produced.
47.
In case of ER agonist assay, where more information is required in addition to the screening and prioritisation purposes of this TG for positive test chemicals, particularly for PC10-PC49 chemicals, as well as chemicals suspected to over-stimulate luciferase, it can be confirmed that the observed luciferase-activity is solely an ERα-specific response, using an ERα antagonist (see Appendix 2.1).
TEST REPORT
48.
See paragraph 20 of “ER TA ASSAY COMPONENTS”.
LITERATURE
(1)
OECD (2015). Report of the Inter-Laboratory Validation for Stably Transfected Transactivation Assay to detect Estrogenic and Anti-estrogenic Activity. Environment, Health and Safety Publications, Series on Testing and Assessment (No 225), Organisation for Economic Cooperation and Development, Paris.
(2)
Escande A., et al. (2006). Evaluation of Ligand Selectivity Using Reporter Cell Lines Stably Expressing Estrogen Receptor Alpha or Beta, Biochem. Pharmacol., 71, 1459-1469.
(3)
Kuiper G.G., et al. (1998). Interaction of Estrogenic Chemicals and Phytoestrogens with Estrogen Receptor Beta, Endocrinol., 139, 4252-4263.
(4)
Spaepen M., et al. (1992). Detection of Bacterial and Mycoplasma Contamination in Cell Cultures by Polymerase Chain Reaction, FEMS Microbiol. Lett., 78(1), 89-94.
(5)
Kobayashi H., et al. (1995). Rapid Detection of Mycoplasma Contamination in Cell Cultures by Enzymatic Detection of Polymerase Chain Reaction (PCR) Products, J. Vet. Med. Sci., 57(4), 769-71.
(6)
Dussurget O. and Roulland-Dussoix D. (1994). Rapid, Sensitive PCR-Based Detection of Mycoplasmas in Simulated Samples of Animal Sera, Appl. Environ. Microbiol., 60(3), 953-9.
(7)
De Lean A., Munson P.J. and Rodbard D. (1978). Simultaneous Analysis of Families of Sigmoidal Curves: Application to Bioassay, Radioligand Assay, and Physiological Dose-Response Curves, Am. J. Physiol., 235, E97-E102.
Appendix 2.1
FALSE POSITIVES: ASSESSMENT OF NON-RECEPTOR MEDIATED LUMINESCENCE SIGNALS
1.
False positives in the ER agonist assay might be generated by non-ER-mediated activation of the luciferase gene, or direct activation of the gene product or unrelated fluorescence. Such effects are indicated by an incomplete or unusual dose-response curve. If such effects are suspected, the effect of an ER antagonist (e.g. 4-hydroxytamoxifen (OHT) at non-toxic concentration) on the response should be examined. The pure antagonist ICI 182780 may not be suitable for this purpose as a sufficient concentration of ICI 182780 may decrease the VC value, and this will affect the data analysis.
2.
To ensure validity of this approach, the following needs to be tested in the same plate:
—
Agonistic activity of the unknown chemical with / without 10 μM of OHT
—
VC (in triplicate)
—
OHT (in triplicate)
—
1 nM of E2 (in triplicate) as agonist PC
—
1 nM of E2 + OHT (in triplicate)
Data interpretation criteria
Note: All wells should be treated with the same concentration of the vehicle.
—
If the agonistic activity of the unknown chemical is NOT affected by the treatment with ER antagonist, it is classified as “Negative”.
—
If the agonistic activity of the unknown chemical is completely inhibited, apply the decision criteria.
—
If the agonistic activity at the lowest concentration is equal to, or is exceeding, PC10 response the unknown chemical is inhibited equal to or exceeding PC10 response. The difference in the responses between the non-treated and treated wells with the ER antagonist is calculated and this difference should be considered as the true response and should be used for the calculation of the appropriate parameters to enable a classification decision to be made.
Data analysis
Check the performance standard.
Check the CV between wells treated under the same conditions.
1.
Calculate the mean of the VC
2.
Subtract the mean of VC from each well value not treated with OHT
3.
Calculate the mean of OHT
4.
Subtract the mean of the VC from each well value treated with OHT
5.
Calculate the mean of the PC
6.
Calculate the relative transcriptional activity of all other wells relative to the PC.
Appendix 2.2
PREPARATION OF SERUM TREATED WITH DEXTRAN COATED CHARCOAL (DCC)
1.
The treatment of serum with dextran-coated charcoal (DCC) is a general method for removal of estrogenic compounds from serum that is added to cell medium, in order to exclude the biased response associated with residual estrogens in serum. 500 ml of fetal bovine serum (FBS) can be treated by this procedure.
Components
2.
The following materials and equipment will be required:
Materials
Activated charcoal
Dextran
Magnesium chloride hexahydrate (MgCl2·6H2O)
Sucrose
1 M HEPES buffer solution (pH 7.4)
Ultrapure water produced from a filter system
Equipment
Autoclaved glass container (size should be adjusted as appropriate)
General laboratory centrifuge (that can set temperature at 4 °C)
Procedure
3.
The following procedure is adjusted for the use of 50 ml centrifuge tubes:
[Day-1] Prepare dextran-coated charcoal suspension with 1 l of ultrapure water containing 1,5 mM of MgCl2, 0,25 M sucrose, 2,5 g of charcoal, 0,25 g dextran and 5 mM of HEPES and stir it at 4 °C, overnight.
[Day-2] Dispense the suspension in 50 ml centrifuge tubes and centrifuge at 10 000 rpm at 4 °C for 10 minutes. Remove the supernatant and store half of the charcoal sediment at 4 °C for the use on Day-3. Suspend the other half of the charcoal with FBS that has been gently thawed to avoid precipitation, and heat-inactivated at 56 °C for 30 minutes, then transfer into an autoclaved glass container such as an Erlenmeyer flask. Stir this suspension gently at 4 °C, overnight.
[Day-3] Dispense the suspension with FBS into centrifuge tubes for centrifugation at 10 000 rpm at 4 °C for 10 minutes. Collect FBS and transfer into the new charcoal sediment prepared and stored on Day-2. Suspend the charcoal sediment and stir this suspension gently in an autoclaved glass container at 4 °C, overnight.
[Day-4] Dispense the suspension for centrifugation at 10 000 rpm at 4 °C for 10 minutes and sterilise the supernatant by filtration through a 0,2 μm sterile filter. This DCC-treated FBS should be stored at -20 °C and can be used for up to a year.
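The quantities for the Day-1 suspension can be worked out from the stated concentrations. The sketch below assumes that magnesium chloride is weighed as the hexahydrate listed under Materials and uses standard molar masses (203,30 g/mol for MgCl2·6H2O and 342,30 g/mol for sucrose), which are not stated in the procedure; the function and key names are hypothetical.

```python
# Standard molar masses in g/mol (assumed here, not given in the procedure).
MOLAR_MASS_MGCL2_6H2O = 203.30
MOLAR_MASS_SUCROSE = 342.30


def dcc_suspension_recipe(volume_l=1.0):
    """Amounts needed for the dextran-coated charcoal suspension of Day-1."""
    return {
        "MgCl2_6H2O_g": 1.5e-3 * volume_l * MOLAR_MASS_MGCL2_6H2O,  # 1,5 mM
        "sucrose_g": 0.25 * volume_l * MOLAR_MASS_SUCROSE,           # 0,25 M
        "charcoal_g": 2.5 * volume_l,                                # 2,5 g per litre
        "dextran_g": 0.25 * volume_l,                                # 0,25 g per litre
        "HEPES_1M_ml": 5e-3 * volume_l / 1.0 * 1000.0,               # 5 mM from 1 M stock
    }
```

For 1 l of suspension this gives about 0,305 g of MgCl2·6H2O, 85,6 g of sucrose and 5 ml of 1 M HEPES buffer solution, in addition to 2,5 g of charcoal and 0,25 g of dextran.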
Appendix 3
VM7LUC ESTROGEN RECEPTOR TRANSACTIVATION ASSAY FOR IDENTIFYING ESTROGEN RECEPTOR AGONISTS AND ANTAGONISTS
INITIAL CONSIDERATIONS AND LIMITATIONS (SEE ALSO GENERAL INTRODUCTION)
1.
This assay uses the VM7Luc4E2 cell line (64). It has been validated by the National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM), and the Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) (1). The VM7Luc cell lines predominantly express endogenous ERα and a minor amount of endogenous ERβ (2) (3) (4).
2.
This assay is applicable to a wide range of substances, provided they can be dissolved in dimethyl sulfoxide (DMSO; CASRN 67-68-5), do not react with DMSO or the cell culture medium, and are not cytotoxic at the concentrations being tested. If use of DMSO is not possible, another vehicle such as ethanol or water may be used (see paragraph 12). The demonstrated performance of the VM7Luc ER TA (ant)agonist assay suggests that data generated with this assay may inform upon ER mediated mechanisms of action and could be considered for prioritisation of substances for further testing.
3.
This assay is specifically designed to detect hERα- and hERβ-mediated TA by measuring chemiluminescence as the endpoint. Chemiluminescence use in bioassays is widespread because luminescence has a high signal-to-background ratio (10). However, the activity of firefly luciferase in cell-based assays can be confounded by substances that inhibit the luciferase enzyme, causing either apparent inhibition or increased luminescence due to protein stabilisation (10). In addition, in some luciferase-based ER reporter gene assays, non-receptor-mediated luminescence signals have been reported at phytoestrogen concentrations higher than 1 μM due to the over-activation of the luciferase reporter gene (9) (11). While the dose-response curve indicates that true activation of the ER system occurs at lower concentrations, luciferase expression obtained at high concentrations of phytoestrogens or similar compounds suspected of producing phytoestrogen-like over-activation of the luciferase reporter gene needs to be examined carefully in stably transfected ER TA assay systems (see Appendix 2).
4.
The “GENERAL INTRODUCTION” and “ER TA ASSAY COMPONENTS” should be read before using this assay for regulatory purposes. Definitions and abbreviations used in this test method are described in Appendix 1.
PRINCIPLE OF THE ASSAY (SEE ALSO GENERAL INTRODUCTION)
5.
The assay is used to indicate ER ligand binding, followed by translocation of the receptor-ligand complex to the nucleus. In the nucleus, the receptor-ligand complex binds to specific DNA response elements and transactivates the reporter gene (luc), resulting in the production of luciferase and the subsequent emission of light, which can be quantified using a luminometer. Luciferase activity can be quickly and inexpensively evaluated with a number of commercially available kits. The VM7Luc ER TA utilises an ER responsive human breast adenocarcinoma cell line, VM7, which has been stably transfected with a firefly luc reporter construct under control of four estrogen response elements placed upstream of the mouse mammary tumour virus promoter (MMTV), to detect substances with in vitro ER agonist or antagonist activity. This MMTV promoter exhibits only minor cross-reactivity with other steroid and non-steroid hormones (8). Criteria for data interpretation are described in detail in paragraph 41. Briefly, a positive response is identified by a concentration-response curve containing at least three points with non-overlapping error bars (mean ± SD), as well as a change in amplitude (normalised relative light unit [RLU]) of at least 20 % of the maximal value for the reference standard (17β-estradiol [E2; CASRN 50-28-2] for the agonist assay, raloxifene HCl [Ral; CASRN 84449-90-1]/E2 for the antagonist assay).
PROCEDURE
Cell Line
6.
The stably transfected VM7Luc4E2 cell line should be used for the assay. The cell line is currently only available with a technical licensing agreement from the University of California, Davis, California, USA (65), and from Xenobiotic Detection Systems Inc., Durham, North Carolina, USA (66).
Stability of the Cell Line
7.
To maintain the stability and integrity of the cell line, the cells should be grown for more than one passage from the frozen stock in cell maintenance media (see paragraph 9). Cells should not be cultured for more than 30 passages. For the VM7Luc4E2 cell line, 30 passages will be approximately three months.
Cell Culture and Plating Conditions
8.
Procedures specified in the Guidance on Good Cell Culture Practice (5) (6) should be followed to assure the quality of all materials and methods in order to maintain the integrity, validity, and reproducibility of any work conducted.
9.
VM7Luc4E2 cells are maintained in RPMI 1640 medium supplemented with 0,9 % Pen-Strep and 8,0 % fetal bovine serum (FBS) in a dedicated tissue culture incubator at 37 °C ± 1 °C, 90 % ± 5 % humidity, and 5,0 % ± 1 % CO2/air.
10.
Upon reaching ~80 % confluence, VM7Luc4E2 cells are subcultured and conditioned to an estrogen-free environment for 48 hours prior to plating the cells in 96-well plates for exposure to test chemicals and analysis of estrogen-dependent induction of luciferase activity. The estrogen-free medium (EFM) contains Dulbecco’s Modification of Eagle’s Medium (DMEM) without phenol red, supplemented with 4,5 % charcoal/dextran-treated FBS, 1,9 % L-glutamine, and 0,9 % Pen-Strep. All plasticware should be free of estrogenic activity [see detailed protocol (7)].
Acceptability Criteria
11.
Acceptance or rejection of a test is based on the evaluation of reference standard and control results from each experiment conducted on a 96-well plate. Each reference standard is tested in multiple concentrations and there are multiple samples of each reference and control concentration. Results are compared to quality controls (QC) for these parameters that were derived from the agonist and antagonist historical databases generated by each laboratory during the demonstration of proficiency. The historical databases are updated with reference standard and control values on a continuous basis. Changes in equipment or laboratory conditions may necessitate generation of updated historical databases.
Agonist Test
Range Finder Test
•
Induction: Plate induction should be measured by dividing the average highest E2 reference standard relative light unit (RLU) value by the average DMSO control RLU value. Five-fold induction is usually achieved, but for the purposes of acceptance, induction should be greater than or equal to four-fold.
•
DMSO control results: Solvent control RLU values should be within 2,5 times the standard deviation of the historical solvent control mean RLU value.
•
An experiment that fails either acceptance criterion should be discarded and repeated.
Comprehensive Test
It includes acceptability criteria from the agonist range finder test and the following:
•
Reference standard results: The E2 reference standard concentration-response curve should be sigmoidal in shape and have at least three values within the linear portion of the concentration-response curve.
•
Positive control results: Methoxychlor control RLU values should be greater than the DMSO mean plus three times the standard deviation from the DMSO mean.
•
An experiment that fails any single acceptance criterion should be discarded and repeated.
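By way of illustration only, the agonist acceptance criteria above can be expressed as a short Python sketch. The function names and all RLU values are hypothetical and are not part of the test method. The sketch assumes that the 2,5 SD and 3 SD criteria are applied to each individual control well and that the sample standard deviation is used; the protocol (7) should be consulted for the definitive procedure.

```python
from statistics import mean, stdev

def fold_induction(e2_top_rlu, dmso_rlu):
    """Mean RLU at the highest E2 concentration divided by the mean DMSO RLU."""
    return mean(e2_top_rlu) / mean(dmso_rlu)

def within_historical(values, hist_mean, hist_sd, k=2.5):
    """True if every control well lies within k historical SDs of the historical mean."""
    return all(abs(v - hist_mean) <= k * hist_sd for v in values)

def methoxychlor_ok(meth_rlu, dmso_rlu, k=3.0):
    """True if every methoxychlor well exceeds the DMSO mean plus k DMSO SDs."""
    threshold = mean(dmso_rlu) + k * stdev(dmso_rlu)
    return all(v > threshold for v in meth_rlu)

def agonist_range_finder_accepted(e2_top_rlu, dmso_rlu, hist_mean, hist_sd):
    """Both range finder criteria: induction >= 4-fold and DMSO within 2.5 SD."""
    return (fold_induction(e2_top_rlu, dmso_rlu) >= 4.0
            and within_historical(dmso_rlu, hist_mean, hist_sd))

# Hypothetical plate readings (RLU), for illustration only.
dmso = [1900, 2100, 2000, 2000]
e2_top = [10200, 9800]
print(fold_induction(e2_top, dmso))
print(agonist_range_finder_accepted(e2_top, dmso, hist_mean=2000, hist_sd=150))
```

The sigmoidal-shape criterion for the E2 curve is not covered by this sketch, since it requires inspection of the fitted concentration-response curve.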
Antagonist Test
Range Finder Test
•
Reduction: Plate reduction is measured by dividing the average highest Ral/E2 reference standard RLU value by the average DMSO control RLU value. Five-fold reduction is usually achieved, but for the purposes of acceptance, reduction should be greater than or equal to three-fold.
•
E2 control results: E2 control RLU values should be within 2,5 times the standard deviation of the historical E2 control mean RLU value.
•
DMSO control results: DMSO control RLU values should be within 2,5 times the standard deviation of the historical solvent control mean RLU value.
•
An experiment that fails any single acceptance criterion will be discarded and repeated.
Comprehensive Test
It includes acceptance criteria from the antagonist range finder test and the following:
•
Reference standard results: The Ral/E2 reference standard concentration-response curve should be sigmoidal in shape and have at least three values within the linear portion of the concentration-response curve.
•
Positive control results: Tamoxifen/E2 control RLU values should be less than the E2 control mean minus three times the standard deviation from the E2 control mean.
•
An experiment that fails any single acceptance criterion will be discarded and repeated.
Reference Standards, Positive, and Vehicle Controls
Vehicle Control (Agonist and Antagonist Assays)
12.
The vehicle that is used to dissolve the test chemicals should be tested as a vehicle control. The vehicle used during the validation of the VM7Luc ER TA assay was 1 % (v/v) dimethylsulfoxide (DMSO, CASRN 67-68-5) (see paragraph 24). If a vehicle other than DMSO is used, all reference standards, controls, and test chemicals should be tested in the same vehicle, if appropriate.
Reference Standard (Agonist Range Finder)
13.
The reference standard is E2 (CASRN 50-28-2). For range finder testing, the reference standard is comprised of a serial dilution of four concentrations of E2 (1,84 × 10⁻¹⁰, 4,59 × 10⁻¹¹, 1,15 × 10⁻¹¹ and 2,87 × 10⁻¹² M), with each concentration tested in duplicate wells.
Reference Standard (Agonist Comprehensive)
14.
E2 for comprehensive testing is comprised of a 1:2 serial dilution consisting of 11 concentrations (ranging from 3,67 × 10⁻¹⁰ to 3,59 × 10⁻¹³ M) of E2 in duplicate wells.
Reference Standard (Antagonist Range Finder)
15.
The reference standard is a combination of Ral (CASRN 84449-90-1) and E2 (CASRN 50-28-2). Ral/E2 for range finder testing is comprised of a serial dilution of three concentrations of Ral (3,06 × 10⁻⁹, 7,67 × 10⁻¹⁰ and 1,92 × 10⁻¹⁰ M) plus a fixed concentration (9,18 × 10⁻¹¹ M) of E2 in duplicate wells.
Reference Standard (Antagonist Comprehensive)
16.
Ral/E2 for comprehensive testing consists of a 1:2 serial dilution of nine concentrations of Ral (ranging from 2,45 × 10⁻⁸ to 9,57 × 10⁻¹¹ M) plus a fixed concentration (9,18 × 10⁻¹¹ M) of E2, in duplicate wells.
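The reference standard concentration series in paragraphs 13 to 16 can be reproduced with a simple serial dilution calculation. The following Python sketch is illustrative only; the function name is hypothetical, and the 1:4 step used for the agonist range finder series is inferred from the stated concentrations rather than given in the text.

```python
def serial_dilution(start, factor, n):
    """Return n concentrations (M), each 1/factor of the previous one."""
    return [start / factor ** i for i in range(n)]

# Comprehensive series: 1:2 dilutions as stated in the text.
e2_comprehensive = serial_dilution(3.67e-10, 2, 11)
ral_comprehensive = serial_dilution(2.45e-8, 2, 9)

# Range finder series: the 1:4 step is an inference from the listed values.
e2_range_finder = serial_dilution(1.84e-10, 4, 4)

for name, series in [("E2 comprehensive", e2_comprehensive),
                     ("Ral comprehensive", ral_comprehensive),
                     ("E2 range finder", e2_range_finder)]:
    print(name, ["%.2e" % c for c in series])
```

The computed end points agree with the stated lowest concentrations to within rounding of the published figures.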
Weak Positive Control (Agonist)
17.
The weak positive control is 9,06 × 10⁻⁶ M p,p'-methoxychlor (methoxychlor; CASRN 72-43-5) in EFM.
Weak Positive Control (Antagonist)
18.
The weak positive control consists of 3,36 × 10⁻⁶ M tamoxifen (CASRN 10540-29-1) with 9,18 × 10⁻¹¹ M E2 in EFM.
E2 Control (Antagonist Assay Only)
19.
The E2 control is 9,18 × 10⁻¹¹ M E2 in EFM and is used as a baseline negative control.
Fold-Induction (Agonist)
20.
The induction of luciferase activity of the reference standard (E2) is measured by dividing the average highest E2 reference standard RLU value by the average DMSO control RLU value, and the result should be greater than four-fold.
Fold-Reduction (Antagonist)
21.
The mean luciferase activity of the reference standard (Ral/E2) is measured by dividing the average highest Ral/E2 reference standard RLU value by the average DMSO control RLU value and should be greater than three-fold.
Demonstration of Laboratory Proficiency (see paragraph 14 and Tables 3 and 4 in “ER TA ASSAY COMPONENTS” of this test method)
Vehicle
22.
Test chemicals should be dissolved in a solvent that solubilises the test chemical and is miscible with the cell medium. Water, ethanol (95 % to 100 % purity) and DMSO are suitable vehicles. If DMSO is used, the level should not exceed 1 % (v/v). For any vehicle, it should be demonstrated that the maximum volume used is not cytotoxic and does not interfere with the assay performance. Reference standards and controls are dissolved in 100 % solvent and then diluted down to appropriate concentrations in EFM.
Preparation of Test chemicals
23.
The test chemicals are dissolved in 100 % DMSO (or appropriate solvent), and then diluted down to appropriate concentrations in EFM. All test chemicals should be allowed to equilibrate to room temperature before being dissolved and diluted. Test chemical solutions should be prepared fresh for each experiment. Solutions should not have noticeable precipitate or cloudiness. Reference standard and control stocks may be prepared in bulk; however, final reference standard, control dilutions and test chemicals should be freshly prepared for each experiment and used within 24 hours of preparation.
Solubility and Cytotoxicity: Considerations for Range Finding
24.
Range finder testing consists of seven-point 1:10 serial dilutions run in duplicate. Initially, test chemicals are tested up to the maximum concentration of 1 mg/ml (~1 mM) for agonist testing and 20 μg/ml (~10 μM) for antagonist testing. Range finder experiments are used to determine the following:
—
Test chemical starting concentrations to be used during comprehensive testing
—
Test chemical dilutions (1:2 or 1:5) to be used during comprehensive testing
25.
An assessment of cell viability/cytotoxicity is included in the agonist and antagonist assay protocols (7) and is incorporated into range finder and comprehensive testing. The cytotoxicity method that was used to assess cell viability during the validation of the VM7Luc ER TA (1) was a scaled qualitative visual observation method; however, a quantitative method for the determination of cytotoxicity can be used (see protocol (7)). Data from test chemical concentrations that cause more than 20 % reduction in viability cannot be used.
Test chemical Exposure and Assay Plate Organisation
26.
Cells are counted and plated into 96-well tissue culture plates (2 × 10⁵ cells per well) in EFM and incubated for 24 hours to allow the cells to attach to the plate. The EFM is removed and replaced with test and reference chemicals in EFM and incubated for 19-24 hours. Special considerations will need to be applied to those substances that are highly volatile since nearby control wells may generate false positive results. In such cases, “plate sealers” may help to effectively isolate individual wells during testing, and are therefore recommended.
Range Finder Tests
27.
Range finder testing uses all wells of the 96-well plate to test up to six test chemicals as seven point 1:10 serial dilutions in duplicate (see Figures 1 and 2).
—
Agonist range finder testing uses four concentrations of E2 in duplicate as the reference standard and four replicate wells for the DMSO control.
—
Antagonist range finder testing uses three concentrations of Ral/E2 with 9,18 × 10⁻¹¹ M E2 in duplicate as the reference standard, with three replicate wells for the E2 and DMSO controls.
Figure 1
Agonist Range Finder Test 96-well Plate Layout
Abbreviations: E2-1 to E2-4 = concentrations of the E2 reference standard (from high to low); TC1-1 to TC1-7 = concentrations (from high to low) of test chemical 1 (TC1); TC2-1 to TC2-7 = concentrations (from high to low) of test chemical 2 (TC2); TC3-1 to TC3-7 = concentrations (from high to low) of test chemical 3 (TC3); TC4-1 to TC4-7 = concentrations (from high to low) of test chemical 4 (TC4); TC5-1 to TC5-7 = concentrations (from high to low) of test chemical 5 (TC5); TC6-1 to TC6-7 = concentrations (from high to low) of test chemical 6 (TC6); VC = vehicle control (DMSO [1 % v/v EFM.]).
Figure 2
Antagonist Range Finder Test 96-well Plate Layout
Abbreviations: E2 = E2 control; Ral-1 to Ral-3 = concentrations of the Raloxifene/E2 reference standard (from high to low); TC1-1 to TC1-7 = concentrations (from high to low) of test chemical 1 (TC1); TC2-1 to TC2-7 = concentrations (from high to low) of test chemical 2 (TC2); TC3-1 to TC3-7 = concentrations (from high to low) of test chemical 3 (TC3); TC4-1 to TC4-7 = concentrations (from high to low) of test chemical 4 (TC4); TC5-1 to TC5-7 = concentrations (from high to low) of test chemical 5 (TC5); TC6-1 to TC6-7 = concentrations (from high to low) of test chemical 6 (TC6); VC = vehicle control (DMSO [1 % v/v EFM.]).
Note: All test chemicals are tested in the presence of 9,18 × 10⁻¹¹ M E2.
28.
The recommended final volume of media required for each well is 200 μl. Only use test plates in which the cells in all wells give a viability of 80 % and above.
29.
Determination of starting concentrations for comprehensive agonist testing is described in depth in the agonist protocol (7). Briefly, the following criteria are used:
—
If there are no points on the test chemical concentration curve that are greater than the mean plus three times the standard deviation of the DMSO control, comprehensive testing will be conducted using an 11-point 1:2 serial dilution starting at the maximum soluble concentration.
—
If there are points on the test chemical concentration curve that are greater than the mean plus three times the standard deviation of the DMSO control, the starting concentration to be used for the 11-point dilution scheme in comprehensive testing should be one log higher than the concentration giving the highest adjusted RLU value in the range finder. The 11-point dilution scheme will be based on either 1:2 or 1:5 dilutions according to the following criteria:
An 11-point 1:2 serial dilution should be used if the resulting concentration range will encompass the full range of responses based on the concentration response curve generated in the range finder test. Otherwise, use a 1:5 dilution.
—
If a test chemical exhibits a biphasic concentration response curve in the range finder test, both phases should also be resolved in comprehensive testing.
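The selection of the starting concentration for comprehensive agonist testing described in paragraph 29 may be sketched as follows. This Python fragment is illustrative only: the function name and data are hypothetical, the sample standard deviation is assumed, and the choice between 1:2 and 1:5 dilutions and the treatment of biphasic curves are not covered, as they depend on inspection of the range finder curve (see agonist protocol (7)).

```python
from statistics import mean, stdev

def agonist_start_concentration(concs, rlus, dmso_rlus, max_soluble):
    """Pick the starting concentration (M) for the 11-point comprehensive series.

    concs and rlus are the range finder concentrations and their mean RLU
    values, in the same order; dmso_rlus are the DMSO control wells.
    """
    threshold = mean(dmso_rlus) + 3 * stdev(dmso_rlus)
    if all(r <= threshold for r in rlus):
        # No point above DMSO mean + 3 SD: start at the maximum soluble concentration.
        return max_soluble
    # Otherwise start one log (ten-fold) above the concentration giving the highest RLU.
    peak = max(range(len(rlus)), key=lambda i: rlus[i])
    return concs[peak] * 10

# Hypothetical range finder data, for illustration only.
concs = [1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8, 1e-9]
active = [3000, 9000, 6000, 2500, 2100, 2000, 1950]
dmso = [1900, 2100, 2000, 2000]
print(agonist_start_concentration(concs, active, dmso, max_soluble=1e-3))
```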
30.
Determination of starting concentrations for comprehensive antagonist testing is described in depth in the antagonist protocol (7). Briefly, the following criteria are used:
—
If there are no points on the test chemical concentration curve that are less than the mean minus three times the standard deviation of the E2 control, comprehensive testing will be conducted using an 11-point 1:2 serial dilution starting at the maximum soluble concentration.
—
If there are points on the test chemical concentration curve that are less than the mean minus three times the standard deviation of the E2 control, the starting concentration to be used for the 11-point dilution scheme in comprehensive testing should be one of the following:
—
The concentration giving the lowest adjusted RLU value in the range finder
—
The maximum soluble concentration (See antagonist protocol (7), Figure 14-2)
—
The lowest cytotoxic concentration (See antagonist protocol (7), Figure 14-3 for a related example).
—
The 11-point dilution scheme will be based on either a 1:2 or 1:5 serial dilution according to the following criteria:
An 11-point 1:2 serial dilution should be used if the resulting concentration range will encompass the full range of responses based on the concentration response curve generated in the range finder test. Otherwise a 1:5 dilution should be used.
Comprehensive Tests
31.
Comprehensive testing consists of 11-point serial dilutions (either 1:2 or 1:5 serial dilutions based on the starting concentration for comprehensive testing criteria) with each concentration tested in triplicate wells of the 96-well plate (see Figures 3 and 4).
—
Agonist comprehensive testing uses 11 concentrations of E2 in duplicate as the reference standard. Four replicate wells for the DMSO control and four replicate wells for the methoxychlor control (9,06 × 10⁻⁶ M) are included on each plate.
—
Antagonist comprehensive testing uses nine concentrations of Ral/E2 with 9,18 × 10⁻¹¹ M E2 in duplicate as the reference standard, with four replicate wells for the 9,18 × 10⁻¹¹ M E2 control, four replicate wells for DMSO controls, and four replicate wells for 3,36 × 10⁻⁶ M tamoxifen.
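As a cross-check, the well allocations described in paragraphs 27 and 31 each account for all 96 wells of the plate. The Python sketch below is illustrative only; the function name is hypothetical, and the figure of two test chemicals per comprehensive plate is taken from the abbreviations listed for Figures 3 and 4.

```python
def wells_used(n_chemicals, n_points, chem_reps, ref_points, ref_reps, control_wells):
    """Total wells: test chemicals + reference standard + control replicates."""
    return (n_chemicals * n_points * chem_reps
            + ref_points * ref_reps
            + sum(control_wells))

layouts = {
    # 6 chemicals x 7 points x 2; 4 E2 concentrations x 2; 4 DMSO wells
    "agonist range finder": wells_used(6, 7, 2, 4, 2, [4]),
    # 6 chemicals x 7 points x 2; 3 Ral/E2 concentrations x 2; 3 E2 + 3 DMSO wells
    "antagonist range finder": wells_used(6, 7, 2, 3, 2, [3, 3]),
    # 2 chemicals x 11 points x 3; 11 E2 x 2; 4 DMSO + 4 methoxychlor wells
    "agonist comprehensive": wells_used(2, 11, 3, 11, 2, [4, 4]),
    # 2 chemicals x 11 points x 3; 9 Ral/E2 x 2; 4 E2 + 4 DMSO + 4 tamoxifen wells
    "antagonist comprehensive": wells_used(2, 11, 3, 9, 2, [4, 4, 4]),
}
for name, total in layouts.items():
    print(name, total)
```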
Figure 3
Agonist Comprehensive Test 96-well Plate Layout
Abbreviations: TC1-1 to TC1-11 = concentrations (from high to low) of test chemical 1; TC2-1 to TC2-11 = concentrations (from high to low) of test chemical 2; E2-1 to E2-11 = concentrations of the E2 reference standard (from high to low); Meth = p,p’ methoxychlor weak positive control; VC = DMSO (1 % v/v) EFM vehicle control
Figure 4
Antagonist Comprehensive Test 96-well Plate Layout
Abbreviations: E2 = E2 control; Ral-1 to Ral-9 = concentrations of the Raloxifene/E2 reference standard (from high to low); Tam = Tamoxifen/E2 weak positive control; TC1-1 to TC1-11 = concentrations (from high to low) of test chemical 1 (TC1); TC2-1 to TC2-11 = concentrations (from high to low) of test chemical 2 (TC2); VC = vehicle control (DMSO [1 % v/v EFM.]).
Note: All reference and test wells contain a fixed concentration of E2 (9,18 × 10⁻¹¹ M).
32.
Repeat comprehensive tests for the same chemical should be conducted on different days, to ensure independence. At least two comprehensive tests should be conducted. If the results of the tests contradict each other (e.g. one test is positive, the other negative), or if one of the tests is inadequate, a third additional test should be conducted.
Measure of Luminescence
33.
Luminescence is measured in the range of 300 to 650 nm, using an injecting luminometer and with software that controls the injection volume and measurement interval (7). Light emission from each well is expressed as RLU per well.
ANALYSIS OF DATA
EC50 /IC50 determination
34.
The EC50 value (half maximal effective concentration of a test chemical [agonists]) and the IC50 value (half maximal inhibitory concentration of a test chemical [antagonists]) are determined from the concentration-response data. For test chemicals that are positive at one or more concentrations, the concentration of test chemical that causes a half-maximal response (IC50 or EC50) is calculated using a Hill function analysis or an appropriate alternative. The Hill function is a four-parameter logistic mathematical model relating the test chemical concentration to the response (typically following a sigmoidal curve) using the equation below:
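$$Y = \mathrm{Bottom} + \frac{\mathrm{Top} - \mathrm{Bottom}}{1 + 10^{(\lg \mathrm{EC}_{50} - X) \times \mathrm{Hillslope}}}$$
Where: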
Y= response (i.e. RLUs);
X= the logarithm of concentration;
Bottom= the minimum response;
Top= the maximum response;
lg EC50 (or lg IC50)= the value of X (the logarithm of concentration) at which the response is midway between Top and Bottom;
Hillslope= the steepness of the curve.
The model calculates the best fit for the Top, Bottom, Hillslope, and IC50 and EC50 parameters. For the calculation of EC50 and IC50 values, appropriate statistical software should be used (e.g. GraphPad Prism® statistical software).
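For orientation, the four-parameter model can be evaluated with a few lines of Python. The sketch below assumes the commonly used form Y = Bottom + (Top − Bottom) / (1 + 10^((lg EC50 − X) × Hillslope)); the function names and parameter values are hypothetical. It does not perform curve fitting, for which appropriate statistical software should be used.

```python
import math

def hill(x, bottom, top, lg_ec50, hillslope):
    """Response Y at log10 concentration x for the four-parameter Hill model."""
    return bottom + (top - bottom) / (1 + 10 ** ((lg_ec50 - x) * hillslope))

def x_for_response(y, bottom, top, lg_ec50, hillslope):
    """Log10 concentration giving response y (requires bottom < y < top)."""
    ratio = (top - bottom) / (y - bottom) - 1
    return lg_ec50 - math.log10(ratio) / hillslope

# Hypothetical fitted parameters: responses scaled so that Top = 10 000 RLU.
params = dict(bottom=0.0, top=10000.0, lg_ec50=-11.0, hillslope=1.0)

print(hill(-11.0, **params))            # response at X = lg EC50 (the midpoint)
print(x_for_response(2000.0, **params)) # log10 concentration at 20 % of Top
```

By construction, the response at X = lg EC50 lies midway between Bottom and Top, which is the defining property of the EC50 (or IC50) parameter.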
Determination of Outliers
35.
Good statistical judgment could be facilitated by including (but not limited to) the Q-test (see agonist and antagonist protocols (7)) for determining “unusable” wells that will be excluded from the data analysis.
36.
For E2 reference standard replicates (sample size of two), any adjusted RLU value for a replicate at a given concentration of E2 is considered an outlier if its value is more than 20 % above or below the adjusted RLU value for that concentration in the historical database.
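The two outlier checks in paragraphs 35 and 36 may be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure of the protocols (7): the function names and data are hypothetical, the Q statistic is computed in the usual Dixon form (gap divided by range), and the critical value against which it is compared is to be taken from the protocol or from a statistical table for the relevant sample size and confidence level.

```python
def dixon_q(values):
    """Return (Q for the lowest value, Q for the highest value): gap / range."""
    s = sorted(values)
    rng = s[-1] - s[0]
    if rng == 0:
        return 0.0, 0.0
    return (s[1] - s[0]) / rng, (s[-1] - s[-2]) / rng

def e2_replicate_is_outlier(value, historical, tolerance=0.20):
    """True if an adjusted E2 RLU value deviates by more than 20 % from the
    historical adjusted RLU value for the same concentration."""
    return abs(value - historical) > tolerance * abs(historical)

# Hypothetical replicate wells (RLU), for illustration only.
q_low, q_high = dixon_q([10, 11, 12, 20])
print(q_low, q_high)  # the suspect high value has the larger Q
print(e2_replicate_is_outlier(1250, historical=1000))
```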
Collection and Adjustment of Luminometer Data for Range Finder Testing
37.
Raw data from the luminometer should be transferred to a spreadsheet template designed for the assay. It should be determined whether there are outlier data points that need to be removed. (See Test Acceptance Criteria for parameters that are determined in the analyses.) The following calculations should be performed:
Agonist
Step 1
Calculate the mean value for the DMSO vehicle control (VC).
Step 2
Subtract the mean value of the DMSO VC from each well value to normalise the data.
Step 3
Calculate the mean fold induction for the reference standard (E2).
Step 4
Calculate the mean EC50 value for the test chemicals.
Antagonist
Step 1
Calculate the mean value for the DMSO VC.
Step 2
Subtract the mean value of the DMSO VC from each well value to normalise the data.
Step 3
Calculate the mean fold reduction for the reference standard (Ral/E2).
Step 4
Calculate the mean value for the E2 reference standard.
Step 5
Calculate the mean IC50 value for the test chemicals.
Collection and Adjustment of Luminometer Data for Comprehensive Testing
38.
Raw data from the luminometer should be transferred to a spreadsheet template designed for the assay. It should be determined whether there are outlier data points that need to be removed. (See Test Acceptance Criteria for parameters that are determined in the analyses.) The following calculations are performed:
Agonist
Step 1
Calculate the mean value for the DMSO VC.
Step 2
Subtract the mean value of the DMSO VC from each well value to normalise the data.
Step 3
Calculate the mean fold induction for the reference standard (E2).
Step 4
Calculate the mean EC50 value for E2 and the test chemicals.
Step 5
Calculate the mean adjusted RLU value for methoxychlor.
Antagonist
Step 1
Calculate the mean value for the DMSO VC.
Step 2
Subtract the mean value of the DMSO VC from each well value to normalise the data.
Step 3
Calculate the mean fold reduction for the reference standard (Ral/E2).
Step 4
Calculate the mean IC50 value for Ral/E2 and the test chemicals.
Step 5
Calculate the mean adjusted RLU value for tamoxifen.
Step 6
Calculate the mean value for the E2 reference standard.
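The first adjustment steps listed above (subtraction of the mean DMSO vehicle control value, followed by scaling so that the maximal reference standard response equals 10 000 RLU, as referred to in Table 1) can be illustrated in Python. The function name and the readings are hypothetical, and the sketch assumes that the maximal response is the mean of the wells at the highest E2 concentration; the spreadsheet template designed for the assay remains the reference implementation.

```python
from statistics import mean

def adjust_and_scale(wells, dmso_wells, e2_top_wells, scale=10000.0):
    """Subtract the mean DMSO value from each well, then scale so that the
    adjusted mean of the top E2 wells equals `scale` RLU."""
    vc = mean(dmso_wells)                 # Step 1: mean vehicle control
    e2_max = mean(e2_top_wells) - vc      # adjusted maximal reference response
    return [(w - vc) * scale / e2_max for w in wells]  # Step 2 plus scaling

# Hypothetical raw luminometer readings (RLU), for illustration only.
print(adjust_and_scale([2000, 6000, 10000],
                       dmso_wells=[1900, 2100],
                       e2_top_wells=[10200, 9800]))
```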
Data Interpretation Criteria
39.
The VM7Luc ER TA is intended as part of a weight of evidence approach to help prioritise substances for ED testing in vivo. Part of this prioritisation procedure will be the classification of the test chemical as positive or negative for either ER agonist or antagonist activity. The positive and negative decision criteria used in the VM7Luc ER TA validation study are described in Table 1.
Table 1
Positive and Negative Decision Criteria
AGONIST ACTIVITY
Positive
—
All test chemicals classified as positive for ER agonist activity should have a concentration–response curve consisting of a baseline, followed by a positive slope, and concluding in a plateau or peak. In some cases, only two of these characteristics (baseline–slope or slope–peak) may be defined.
—
The line defining the positive slope should contain at least three points with non-overlapping error bars (mean ± SD). Points forming the baseline are excluded, but the linear portion of the curve may include the peak or first point of the plateau.
—
A positive classification requires a response amplitude, the difference between baseline and peak, of at least 20 % of the maximal value for the reference standard, E2 (i.e. 2000 RLUs or more when the maximal response value of the reference standards [E2] is adjusted to 10,000 RLUs).
—
If possible, an EC50 value should be calculated for each positive test chemical.
Negative
The average adjusted RLU for a given concentration is at or below the mean DMSO control RLU value plus three times the standard deviation of the DMSO RLU.
Inadequate
Data that cannot be interpreted as valid for showing either the presence or absence of activity because of major qualitative or quantitative limitations are considered inadequate and cannot be used to determine whether the test chemical is positive or negative. Chemicals should be retested.
ANTAGONIST ACTIVITY
Positive
—
Test chemical data produce a concentration-response curve consisting of a baseline, which is followed by a negative slope.
—
The line defining the negative slope should contain at least three points with non-overlapping error bars; points forming the baseline are excluded but the linear portion of the curve may include the first point of the plateau.
—
There should be at least a 20 % reduction in activity from the maximal value for the reference standard, Ral/E2 (i.e. 8000 RLU or less when the maximal response value of the reference standard [Ral/E2] is adjusted to 10,000 RLUs).
—
The highest non-cytotoxic concentrations of the test chemical should be less than or equal to 1 × 10⁻⁵ M.
—
If possible, an IC50 value should be calculated for each positive test chemical.
Negative
All data points are above the EC80 value (80 % of the E2 response, or 8000 RLUs), at concentrations less than 1,0 × 10⁻⁵ M.
Inadequate
Data that cannot be interpreted as valid for showing either the presence or absence of activity because of major qualitative or quantitative limitations are considered inadequate and cannot be used to determine whether the test chemical is positive or negative. Chemicals should be retested.
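Two of the quantitative agonist criteria in Table 1 lend themselves to a simple automated pre-check, sketched below in Python. The sketch is deliberately partial and its names and values are hypothetical: it approximates the baseline by the lowest mean response, it does not test curve shape or the requirement for three points with non-overlapping error bars, and it therefore returns only a provisional label that does not replace the full evaluation.

```python
def agonist_precheck(mean_rlus, dmso_mean, dmso_sd, e2_max=10000.0):
    """Provisional label from adjusted, normalised mean RLU values."""
    threshold = dmso_mean + 3 * dmso_sd
    if all(r <= threshold for r in mean_rlus):
        return "negative"               # no point above DMSO mean + 3 SD
    amplitude = max(mean_rlus) - min(mean_rlus)  # baseline taken as the minimum
    if amplitude >= 0.20 * e2_max:
        return "candidate positive"     # amplitude >= 20 % of the E2 maximum
    return "review"                     # neither criterion met: inspect the curve

# Hypothetical normalised responses, for illustration only.
print(agonist_precheck([100, 150, 900, 3500, 6000], dmso_mean=100, dmso_sd=50))
print(agonist_precheck([100, 120, 90], dmso_mean=100, dmso_sd=50))
```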
40.
Positive results will be characterised by both the magnitude of the effect and the concentration at which the effect occurs, where possible. Examples of positive, negative and inadequate data are shown in Figures 5 and 6.
Figure 5
Agonist Examples: Positive, Negative and Inadequate Data
Dashed line indicates 20 % of E2 response, 2000 adjusted and normalised RLUs.
Figure 6
Antagonist Examples: Positive, Negative, and Inadequate Data
Dashed line indicates 80 % of Ral/E2 response, 8000 adjusted and normalised RLUs.
Solid line indicates 1,00 × 10⁻⁵ M. For a response to be considered positive, it should be below the 8000 RLU line, and at concentrations less than 1,00 × 10⁻⁵ M.
Asterisked concentrations in the meso-hexestrol graph indicate viability scores of "2" or greater.
The test results for meso-hexestrol are considered inadequate data because the only response that is below 8000 RLU occurs at 1,00 × 10⁻⁵ M.
41.
The calculations of EC50 and IC50 can be made using a four-parameter Hill Function (see agonist protocol and antagonist protocol for more details (7)). Meeting the acceptability criteria indicates the system is operating properly, but it does not ensure that any particular run will produce accurate data. Duplicating the results of the first run is the best assurance that accurate data were produced (see paragraph 19 of “ER TA ASSAY COMPONENTS”).
TEST REPORT
42.
See paragraph 20 of “ER TA ASSAY COMPONENTS”.
LITERATURE
(1)
ICCVAM. (2011). ICCVAM Test Method Evaluation Report on the LUMI-CELL® ER (BG1Luc ER TA) Test Method: An In Vitro Method for Identifying ER Agonists and Antagonists, National Institute of Environmental Health Sciences: Research Triangle Park, NC.
(2)
Monje P., Boland R. (2001). Subcellular Distribution of Native Estrogen Receptor α and β Isoforms in Rabbit Uterus and Ovary, J. Cell Biochem., 82(3): 467-479.
(3)
Pujol P., et al. (1998). Differential Expression of Estrogen Receptor-Alpha and -Beta Messenger RNAs as a Potential Marker of Ovarian Carcinogenesis, Cancer Res., 58(23): 5367-5373.
(4)
Weihua Z., et al. (2000). Estrogen Receptor (ER) β, a Modulator of ERα in the Uterus, Proceedings of the National Academy of Sciences of the United States of America 97(11): 5936-5941.
(5)
Balls M., et al. (2006). The Importance of Good Cell Culture Practice (GCCP), ALTEX, 23(Suppl): p. 270-273.
(6)
Coecke S., et al. (2005). Guidance on Good Cell Culture Practice: a Report of the Second ECVAM Task Force on Good Cell Culture Practice, Alternatives to Laboratory Animals, 33: p. 261-287.
(7)
ICCVAM (2011). ICCVAM Test Method Evaluation Report, The LUMI-CELL® ER (BG1Luc ER TA) Test Method: An In Vitro Assay for Identifying Human Estrogen Receptor Agonist and Antagonist Activity of Chemicals, NIH Publication No 11-7850.
(8)
Rogers J.M., Denison M.S. (2000). Recombinant Cell Bioassays for Endocrine Disruptors: Development of a Stably Transfected Human Ovarian Cell Line for the Detection of Estrogenic and Anti-Estrogenic Chemicals, In Vitro Mol. Toxicol.,13(1):67-82.
(9)
Escande A., et al. (2006). Evaluation of Ligand Selectivity Using Reporter Cell Lines Stably Expressing Estrogen Receptor Alpha or Beta, Biochem. Pharmacol., 71(10):1459-69.
(10)
Thorne N., Inglese J., Auld D.S. (2010). Illuminating Insights into Firefly Luciferase and Other Bioluminescent Reporters Used in Chemical Biology, Chemistry and Biology,17(6):646-57.
(11)
Kuiper G.G, et al. (1998). Interaction of Estrogenic Chemicals and Phytoestrogens with Estrogen Receptor Beta, Endocrinology,139(10):4252-63.
(12)
Geisinger, et al. (1989). Characterization of a human ovarian carcinoma cell line with estrogen and progesterone receptors, Cancer 63, 280-288.
(13)
Baldwin, et al. (1998). BG-1 ovarian cell line: an alternative model for examining estrogen-dependent growth in vitro, In Vitro Cell. Dev. Biol. – Animal, 34, 649-654.
(14)
Li, Y., et al. (2014). Research resource: STR DNA profile and gene expression comparisons of human BG-1 cells and a BG-1/MCF-7 clonal variant, Mol. Endo. 28, 2072-2081.
(15)
Rogers, J.M. and Denison, M.S. (2000). Recombinant cell bioassays for endocrine disruptors: development of a stably transfected human ovarian cell line for the detection of estrogenic and anti-estrogenic chemicals, In Vitro & Molec. Toxicol. 13, 67-82.
Appendix 4
STABLY TRANSFECTED HUMAN ESTROGEN RECEPTOR-α TRANSACTIVATION ASSAY FOR DETECTION OF ESTROGENIC AGONIST AND ANTAGONIST ACTIVITY OF CHEMICALS USING THE ERα CALUX CELL LINE
INITIAL CONSIDERATIONS AND LIMITATIONS (SEE ALSO GENERAL INTRODUCTION)
1.
The ERα CALUX transactivation assay uses the human U2OS cell line to detect estrogenic agonist and antagonist activity mediated through human estrogen receptor alpha (hERα). The validation study of the stably transfected ERα CALUX bioassay by BioDetection Systems BV (Amsterdam, the Netherlands) demonstrated the relevance and reliability of the assay for its intended purpose (1). The ERα CALUX cell line expresses stably transfected human ERα only (2) (3).
2.
This assay is specifically designed to detect hERα-mediated transactivation by measuring bioluminescence as the endpoint. Bioluminescence is commonly used in bioassays because of its high signal-to-noise ratio (4).
3.
Phytoestrogen concentrations higher than 1 μM have been reported to over-activate the luciferase reporter gene, resulting in non-receptor-mediated luminescence (5) (6) (7). Therefore, higher concentrations of phytoestrogens or other similar compounds that can over-activate luciferase expression have to be examined carefully in stably transfected ER transactivation assays (see Appendix 2).
4.
The “GENERAL INTRODUCTION” and “ER TA ASSAY COMPONENTS” should be read before using this assay for regulatory purposes. Definitions and abbreviations used in this test method are described in Appendix 1.
PRINCIPLE OF THE ASSAY (SEE ALSO GENERAL INTRODUCTION)
5.
The bioassay is used to assess ER ligand binding and subsequent translocation of the receptor-ligand complex to the nucleus. In the nucleus, the receptor-ligand complex binds specific DNA response elements and transactivates a firefly luciferase reporter gene, resulting in increased cellular expression of the luciferase enzyme. Following the addition of the luciferase substrate luciferin, the luciferin is transformed into a bioluminescent product. The light produced can easily be detected and quantified using a luminometer.
6.
The test system utilises stably transfected ERα CALUX cells. ERα CALUX cells originated from the human osteoblastic osteosarcoma U2OS cell line. Human U2OS cells were stably transfected with 3xHRE-TATA-Luc and pSG5-neo-hERα using the calcium phosphate co-precipitation method. The U2OS cell line was selected as the best candidate to serve as the estrogen- (and other steroid hormone) responsive reporter cell line, based on the observation that the U2OS cell line showed little or no endogenous receptor activity. The absence of endogenous receptors was assessed using luciferase reporter plasmids only, showing no activity when receptor ligands were added. Furthermore, this cell line supported strong hormone-mediated responses when cognate receptors were transiently introduced (2) (3) (8).
7.
Testing chemicals for estrogenic or anti-estrogenic activity using the ERα CALUX cell line includes a prescreen run and comprehensive runs. During the prescreen run, the solubility, cytotoxicity and a refined concentration range of test chemicals for comprehensive testing are determined. During the comprehensive runs, the refined concentration ranges of test chemicals are tested in the ERα CALUX bioassays, followed by the classification of the test chemicals for agonism or antagonism.
8.
Criteria for data interpretation are described in detail in paragraph 59. Briefly, a test chemical is considered positive for agonism if at least two consecutive concentrations of the test chemical show a response that is equal to or higher than 10 % of the maximum response of the reference standard 17β-estradiol (PC10). A test chemical is considered positive for antagonism if at least two consecutive concentrations of the test chemical show a response that is equal to or lower than 80 % of the maximum response of the reference standard tamoxifen (PC80).
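The "two consecutive concentrations" rule summarised above can be sketched as follows. This is a minimal illustration, not the data interpretation procedure of paragraph 59: the function names are hypothetical, and the responses are assumed to be already expressed as a percentage of the maximum response of the relevant reference standard, ordered by concentration, with cytotoxic concentrations excluded.

```python
from typing import Sequence


def has_consecutive_hits(flags: Sequence[bool], run_length: int = 2) -> bool:
    """Return True if at least `run_length` adjacent entries are True."""
    streak = 0
    for flag in flags:
        streak = streak + 1 if flag else 0
        if streak >= run_length:
            return True
    return False


def is_agonist_positive(relative_responses: Sequence[float], pc10: float = 10.0) -> bool:
    """Agonism: two consecutive concentrations at or above the PC10 level (10 %)."""
    return has_consecutive_hits([r >= pc10 for r in relative_responses])


def is_antagonist_positive(relative_responses: Sequence[float], pc80: float = 80.0) -> bool:
    """Antagonism: two consecutive concentrations at or below the PC80 level (80 %)."""
    return has_consecutive_hits([r <= pc80 for r in relative_responses])
```

A single concentration crossing the threshold, or two non-adjacent ones, does not count as positive in this sketch.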
PROCEDURE
Cell lines
9.
The stably transfected U2OS ERα CALUX cell line should be used for the assay. The cell line can be obtained from BioDetection Systems BV, Amsterdam, the Netherlands with a technical licensing agreement.
10.
Only mycoplasma free cell cultures should be used. Cell batches used should either be certified negative for mycoplasma contamination, or a mycoplasma test should be performed before use. RT-PCR (Real Time Polymerase Chain Reaction) should be used for sensitive detection of mycoplasma infection (9).
Stability of the cell line
11.
To maintain the stability and integrity of the CALUX cells, the cells should be stored in liquid nitrogen (-80 °C). Following thawing of cells to start a new culture, cells should be sub-cultured at least twice before being used to assess the estrogenic agonist and antagonist activity of chemicals. Cells should not be sub-cultured for more than 30 passages.
12.
To monitor the stability of the cell line over time, the responsiveness of the agonistic and antagonistic test system should be verified by evaluating the EC50 or IC50 of the reference standard. In addition, the relative induction of the positive control sample (PC) and the negative control sample (NC) should be monitored. The results should be in agreement with the acceptance criteria for the agonistic (Table 3C) or antagonistic ERα CALUX bioassay (Table 4C). The reference standards, positive and negative controls are given in Table 1 and Table 2 for the agonistic and antagonistic mode respectively.
Cell Culture and plating conditions
13.
The U2OS cells should be cultured in growth medium (DMEM/F12 (1:1) medium with phenol red as pH indicator, supplemented with fetal bovine serum (7,5 %), non-essential amino acids (1 %), 10 Units/ml of penicillin, streptomycin and geneticin (G-418) as selection marker). Cells should be placed in a CO2 incubator (5 % CO2) at 37 °C and 100 % humidity. When cells reach 85-95 % confluency, they should either be subcultured or prepared for seeding in 96-well microtiter plates. In the latter case, cells should be resuspended at 1 × 10⁵ cells/ml in estrogen free assay medium (DMEM/F12 (1:1) medium without phenol red, supplemented with Dextran-Coated Charcoal treated fetal bovine serum (5 % v/v), non-essential amino acids (1 % v/v), 10 Units/ml of penicillin and streptomycin) and plated into the wells of the 96-well microtiter plates (100 μl of homogenised cell suspension). Cells should be pre-incubated in a CO2 incubator (5 % CO2, 37 °C, 100 % humidity) for 24 hours prior to exposure. Plastic ware should be estrogen free.
Acceptability criteria
14.
Agonistic and antagonistic activities of the test chemical(s) are tested in test series. A test series consists of a maximum of 6 microtiter plates. Each test series contains at least 1 full series of dilutions of a reference standard, a positive control sample, a negative control sample and solvent controls. Figures 1 and 2 give the plate setup for agonistic and antagonistic test series.
15.
Each dilution of the reference standards, test chemicals, all solvent controls, and positive and negative controls should be analysed in triplicate. Each of the triplicate analyses should fulfil the requirements given in Table 3A and Table 4A.
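The triplicate requirement can be checked with a short calculation. The sketch below assumes that "%SD" in Tables 3A and 4A means the sample standard deviation (n - 1) expressed as a percentage of the mean of the three wells; that reading, and the function names, are assumptions made for illustration. The limit is passed in explicitly because the tables apply 15 % to some well types and 30 % to others.

```python
from statistics import mean, stdev
from typing import Sequence


def percent_sd(replicates: Sequence[float]) -> float:
    """Sample standard deviation as a percentage of the mean (assumed meaning of %SD)."""
    return 100.0 * stdev(replicates) / mean(replicates)


def triplicate_ok(replicates: Sequence[float], limit_percent: float) -> bool:
    """True if the %SD of the wells is below the limit taken from Table 3A or 4A."""
    return percent_sd(replicates) < limit_percent


# Hypothetical RLU readings for one triplicate
print(percent_sd([100.0, 110.0, 90.0]))
```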
16.
A complete series of dilutions of the reference standard (17β-estradiol for agonism; tamoxifen for antagonism) is measured on the first plate in each test series. To be able to compare the analysis results of the remaining 5 microtiter plates with the first microtiter plate containing the complete concentration-response curve of the reference standard, all plates should contain 3 control samples: solvent control, the highest concentration of the reference standard tested, and the approximate EC50 (agonism) or IC50 (antagonism) concentration of the reference standard. The ratio of the average control samples on the first plate and the remaining 5 plates should fulfil the requirements as given in Table 3C (agonism) or Table 4C (antagonism).
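The plate-to-plate comparison described above amounts to a ratio of mean control responses. The following sketch assumes the ratio is taken as plate 1 divided by the later plate, which the text does not state explicitly; the RLU values and function names are hypothetical, and the bounds should be taken from Table 3 or Table 4 as applicable.

```python
from statistics import mean
from typing import Sequence


def plate_ratio(plate1_wells: Sequence[float], other_plate_wells: Sequence[float]) -> float:
    """Mean control response on plate 1 divided by the mean on another plate."""
    return mean(plate1_wells) / mean(other_plate_wells)


def ratio_within(ratio: float, lower: float, upper: float) -> bool:
    """True if the ratio lies inside the closed interval [lower, upper]."""
    return lower <= ratio <= upper


# Hypothetical readings for the approximate EC50 control on plate 1 and plate 2
ec50_ratio = plate_ratio([5200, 5000, 5100], [4700, 4800, 4900])
ec50_ok = ratio_within(ec50_ratio, 0.70, 1.30)
```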
17.
For each of the microtiter plates within a test series, the z-factor is calculated (10). The z-factor should be calculated using the responses at the highest and lowest concentration of the reference standard. A microtiter plate is considered valid in case it fulfils the requirements as stated in Table 3C (agonism) or Table 4C (antagonism).
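As an illustration of the plate check, the sketch below uses the commonly cited z-factor definition, 1 - 3 × (SD_high + SD_low) / |mean_high - mean_low|. It is an assumption that reference (10) defines the z-factor in exactly this form, and the use of the sample standard deviation is likewise an assumption; the 0,6 threshold is the one listed in Tables 3 and 4.

```python
from statistics import mean, stdev
from typing import Sequence


def z_factor(high_wells: Sequence[float], low_wells: Sequence[float]) -> float:
    """z-factor from responses at the highest and lowest reference standard concentration."""
    separation = abs(mean(high_wells) - mean(low_wells))
    return 1.0 - 3.0 * (stdev(high_wells) + stdev(low_wells)) / separation


def plate_passes_z(high_wells: Sequence[float], low_wells: Sequence[float],
                   minimum: float = 0.6) -> bool:
    """True if the plate's z-factor exceeds the minimum from Table 3 or 4."""
    return z_factor(high_wells, low_wells) > minimum


# Hypothetical RLU readings
z = z_factor([1000.0, 1020.0, 980.0], [100.0, 110.0, 90.0])
```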
18.
The reference standard should demonstrate a sigmoidal dose-response curve. The EC50 or IC50 derived from the response of the series of dilutions of the reference standard, should fulfil the requirements as indicated in Table 3C (agonism) or Table 4C (antagonism).
19.
Each test series should contain a positive control and negative control sample. The calculated relative induction of both the positive and negative control sample should fulfil the requirements as indicated in Table 3C (agonism) or Table 4C (antagonism).
20.
During all measurements, the induction factor of the highest concentration of the reference standard should be calculated by dividing the average relative light unit (RLU) response at the highest 17β-estradiol reference standard concentration by the average RLU response of the reference standard solvent control. This induction factor should fulfil the minimum requirements for the fold induction as indicated in Table 3C (agonism) or Table 4C (antagonism).
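The induction factor is a simple ratio of mean RLU values. A minimal sketch, with hypothetical readings and function names, is given below; the minimum of 5 is the agonism value from Table 3. For antagonism, Table 4 expresses the fold induction as the solvent control relative to the highest tamoxifen concentration, so the two arguments would be swapped accordingly.

```python
from statistics import mean
from typing import Sequence


def fold_induction(numerator_wells: Sequence[float], denominator_wells: Sequence[float]) -> float:
    """Mean RLU of the numerator wells divided by mean RLU of the denominator wells."""
    return mean(numerator_wells) / mean(denominator_wells)


def induction_ok(factor: float, minimum: float) -> bool:
    """True if the fold induction reaches the minimum from Table 3 or 4."""
    return factor >= minimum


# Agonism example: highest 17β-estradiol concentration vs. solvent control (C0)
factor = fold_induction([600.0, 650.0, 550.0], [100.0, 110.0, 90.0])
```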
21.
Only microtiter plates that fulfil all above mentioned acceptance criteria are considered valid and can be used to evaluate the response of test chemicals.
22.
The acceptance criteria are applicable to both prescreen and comprehensive runs.
Table 1
Concentrations of reference standard, positive control (PC) and negative control (NC) for the agonistic CALUX bioassay
Role | Substance | CAS RN | Test range (M)
Reference standard | 17β-estradiol | 50-28-2 | 1 × 10⁻¹³ – 1 × 10⁻¹⁰
Positive control (PC) | 17α-methyltestosterone | 58-18-4 | 3 × 10⁻⁶
Negative control (NC) | corticosterone | 50-22-6 | 1 × 10⁻⁸
Table 2
Concentrations of reference standard, positive control (PC) and negative control (NC) for the antagonistic CALUX bioassay
Role | Substance | CAS RN | Test range (M)
Reference standard | tamoxifen | 10540-29-1 | 3 × 10⁻⁹ – 1 × 10⁻⁵
Positive control (PC) | 4-hydroxytamoxifen | 68047-06-3 | 1 × 10⁻⁹
Negative control (NC) | resveratrol | 501-36-0 | 1 × 10⁻⁵
Table 3
Acceptance criteria for the agonistic ERα CALUX bioassay
A - individual samples on a plate | Criterion
1 | Maximum %SD of triplicate wells (for NC, PC, each dilution of the test chemical and the reference standard, except C0) | < 15 %
2 | Maximum %SD of triplicate wells (for reference standard and test chemical solvent controls (C0, SC)) | < 30 %
3 | Maximum LDH leakage, as a measure of cytotoxicity | < 120 %
B - within a single microtiter plate
4 | Ratio of the reference standard solvent control (C0; plate 1) and test chemical solvent control (SC; plates 2 to x) | 0,5 to 2,0
5 | Ratio of the appr. EC50 and highest reference standard concentrations on plate 1 and the appr. EC50 and highest reference standard concentrations on plates 2 to x (C4, C8) | 0,70 to 1,30
6 | Z-factor for each plate | > 0,6
C - within a single series of analyses (all plates within one series)
7 | Sigmoidal curve of reference standard | Yes (17β-estradiol)
8 | EC50 range reference standard 17β-estradiol | 4 × 10⁻¹² – 4 × 10⁻¹¹ M
9 | Minimum fold induction of the highest 17β-estradiol concentration, with respect to the reference standard solvent control | 5
10 | Relative induction (%) PC | > 30 %
11 | Relative induction (%) NC | < 10 %
Appr.: approximate; PC: positive control; NC: negative control; SC: test chemical solvent control; C0: reference standard solvent control; SD: standard deviation; LDH: lactate dehydrogenase
Table 4
Acceptance criteria for the antagonistic ERα CALUX bioassay
A - individual samples on a plate | Criterion
1 | Maximum %SD of triplicate wells (for NC, PC, each dilution of the test chemical and the reference standard, solvent control (C0)) | < 15 %
2 | Maximum %SD of triplicate wells (for vehicle control (VC) and highest reference standard concentration (C8)) | < 30 %
3 | Maximum LDH leakage, as a measure of cytotoxicity | < 120 %
B - within a single microtiter plate
4 | Ratio of the reference standard solvent control (C0; plate 1) and test chemical solvent control (SC; plates 2 to x) | 0,70 to 1,30
5 | Ratio of the appr. IC50 reference standard concentrations on plate 1 and the appr. IC50 reference standard concentrations on plates 2 to x (C4) | 0,70 to 1,30
6 | Ratio of the highest reference standard concentrations on plate 1 and the highest reference standard concentrations on plates 2 to x (C8) | 0,50 to 2,0
7 | Z-factor for each plate | > 0,6
C - within a single series of analyses (all plates within one series)
8 | Sigmoidal curve of reference standard | Yes (tamoxifen)
9 | IC50 range reference standard (tamoxifen) | 1 × 10⁻⁸ – 1 × 10⁻⁷ M
10 | Minimum fold induction of the reference standard solvent control, with respect to the highest tamoxifen concentration | 2,5
11 | Relative induction (%) PC | < 70 %
12 | Relative induction (%) NC | > 85 %
Appr.: approximate; PC: positive control; NC: negative control; VC: vehicle control (solvent control without fixed concentration of agonist reference standard); SC: test chemical solvent control; C0: reference standard solvent control; SD: standard deviation; LDH: lactate dehydrogenase
23.
For both the prescreen run and comprehensive runs, the same solvent/vehicle control, reference standards, positive controls and negative controls should be used. In addition, the concentrations of reference standards, positive controls and negative controls should be the same.
Solvent control
24.
The solvent used to dissolve the test chemicals should be tested as a solvent control. Dimethylsulfoxide (DMSO, 1 % (v/v); CASRN 67-68-5) was used as vehicle during the validation of the ERα CALUX bioassay. If a solvent other than DMSO is used, all reference standards, controls, and test chemicals should be tested in the same vehicle. Please note that the solvent control for antagonistic studies contains a fixed concentration of the agonist reference standard 17β-estradiol (approximately EC50 concentration). To test the solvent used for antagonistic studies, a vehicle control should be prepared and tested.
Vehicle control (antagonism)
25.
For testing antagonism, the assay medium is supplemented with a fixed concentration of the agonist reference standard 17β-estradiol (approximately EC50 concentration). To test the solvent used to dissolve the test chemicals for antagonism, an assay medium without a fixed concentration of the agonist reference standard 17β-estradiol should be prepared. This control sample is indicated as the vehicle control. Dimethylsulfoxide (DMSO, 1 % (v/v); CASRN 67-68-5) was used as vehicle during the validation of the ERα CALUX bioassay. If a solvent other than DMSO is used, all reference standards, controls, and test chemicals should be tested in the same vehicle.
Reference standards
26.
The agonistic reference standard is 17β-estradiol (Table 1). The reference standards comprise a series of dilutions of eight concentrations of 17β-estradiol (1 × 10⁻¹³, 3 × 10⁻¹³, 1 × 10⁻¹², 3 × 10⁻¹², 6 × 10⁻¹², 1 × 10⁻¹¹, 3 × 10⁻¹¹, 1 × 10⁻¹⁰ M).
27.
The antagonistic reference standard is tamoxifen (Table 2). The reference standards comprise a series of dilutions of eight concentrations of tamoxifen (3 × 10⁻⁹, 1 × 10⁻⁸, 3 × 10⁻⁸, 1 × 10⁻⁷, 3 × 10⁻⁷, 1 × 10⁻⁶, 3 × 10⁻⁶, 1 × 10⁻⁵ M). Each of the concentrations of the antagonistic reference standard is co-incubated with a fixed concentration of the agonistic reference standard 17β-estradiol (3 × 10⁻¹² M).
Positive control
28.
The positive control for agonistic studies is 17α-methyltestosterone (Table 1).
29.
The positive control for antagonistic studies is 4-hydroxytamoxifen (Table 2). The antagonistic positive control is co-incubated with a fixed concentration of the agonistic reference standard 17β-estradiol (3 × 10⁻¹² M).
Negative control
30.
The negative control for agonistic studies is corticosterone (Table 1).
31.
The negative control for antagonistic studies is resveratrol (Table 2). The antagonistic negative control is co-incubated with a fixed concentration of the agonistic reference standard 17β-estradiol (3 × 10⁻¹² M).
Demonstration of laboratory proficiency (see paragraph 14 and Tables 3 and 4 in “ER TA ASSAY COMPONENTS” of this test method)
Vehicle
32.
The solvent used to dissolve test chemicals should solubilise the test chemical completely and should be miscible with the cell medium. DMSO, water and ethanol (95 % to 100 % purity) are suitable solvents. If DMSO is used as solvent, the concentration of DMSO during incubation should not exceed 1 % (v/v). Prior to use, the solvent should be tested for absence of cytotoxicity and of interference with the performance of the assay.
Preparation of reference standards, positive controls, negative controls and test chemicals
33.
Reference standards, positive controls, negative controls and test chemicals are dissolved in 100 % DMSO (or an appropriate solvent). Appropriate (serial) dilutions should then be prepared in the same solvent. Before being dissolved, all substances should be allowed to equilibrate to room temperature. Freshly prepared stock solutions of reference standards, positive controls, negative controls and test chemicals should not have noticeable precipitate or cloudiness. Reference standard and control stocks may be prepared in bulk. Stock solutions of test chemicals should be prepared fresh before each experiment. Final dilutions of reference standards, positive controls, negative controls and test chemicals should be prepared fresh for each experiment and used within 24 hours of preparation.
Solubility, cytotoxicity and range finding.
34.
During the prescreen run, the solubility of the test chemicals in the solvent of choice is determined. A maximum stock concentration of 0,1 M is prepared. In case this concentration shows solubility problems, lower stock solutions should be prepared until test chemicals are fully solubilised. During the prescreen run, 1:10 serial dilutions of test chemical are tested. The maximum assay concentration for agonist or antagonist testing is 1 mM. Following prescreening, an appropriate refined concentration range for test chemicals is derived that should be tested during the comprehensive runs. The dilutions used for comprehensive testing should be 1x, 3x, 10x, 30x, 100x, 300x, 1000x and 3000x.
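The concentration series described above can be tabulated with a few lines of code. In this sketch the function names are hypothetical, and the choice of eight prescreen steps is an assumption taken from the eight test chemical dilutions (TCx-1 to TCx-8) in the plate layout. Note that a 0,1 M stock at 1 % (v/v) solvent corresponds to the 1 mM maximum assay concentration.

```python
from typing import List

# Dilution factors for comprehensive testing, as listed in the paragraph above
COMPREHENSIVE_FACTORS = (1, 3, 10, 30, 100, 300, 1000, 3000)


def prescreen_series(top_molar: float = 1e-3, steps: int = 8) -> List[float]:
    """1:10 serial dilutions starting from the top assay concentration (high to low)."""
    return [top_molar / 10 ** i for i in range(steps)]


def comprehensive_series(top_molar: float) -> List[float]:
    """Concentrations obtained by applying the fixed dilution factors to the top concentration."""
    return [top_molar / factor for factor in COMPREHENSIVE_FACTORS]


# Example: refined top concentration of 3 µM chosen after the prescreen (hypothetical)
for conc in comprehensive_series(3e-6):
    print(f"{conc:.1e} M")
```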
35.
Cytotoxicity testing is included in the agonist and antagonist assay protocol (11). Cytotoxicity testing is incorporated in both the prescreen run and comprehensive runs. The method used to assess cytotoxicity during the validation of the ERα CALUX bioassay was the lactate dehydrogenase (LDH) leakage test in combination with qualitative visual inspection of cells (see Appendix 4.1) following exposure to test chemicals. However, other quantitative methods for the determination of cytotoxicity (e.g. tetrazolium-based colorimetric (MTT) assay or cytotoxicity CALUX bioassay) can be used. In general, test chemical concentrations that show more than 20 % reduction of cell viability are considered cytotoxic and therefore cannot be used for data evaluation. With respect to the LDH leakage assay, a concentration of the test chemical is regarded as cytotoxic when the percentage LDH leakage is higher than 120 %.
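The two cytotoxicity thresholds can be applied as a simple filter before data evaluation. The sketch below is illustrative only: the function names and example values are hypothetical, and LDH leakage and viability are assumed to be already expressed as percentages relative to the solvent control.

```python
from typing import List, Sequence


def is_cytotoxic_ldh(ldh_leakage_percent: float, limit: float = 120.0) -> bool:
    """LDH leakage above 120 % is regarded as cytotoxic."""
    return ldh_leakage_percent > limit


def is_cytotoxic_viability(viability_percent: float, max_reduction: float = 20.0) -> bool:
    """More than 20 % reduction of cell viability is considered cytotoxic."""
    return (100.0 - viability_percent) > max_reduction


def usable_concentrations(concentrations: Sequence[float],
                          ldh_leakage: Sequence[float]) -> List[float]:
    """Keep only concentrations whose LDH leakage does not exceed the limit."""
    return [c for c, ldh in zip(concentrations, ldh_leakage) if not is_cytotoxic_ldh(ldh)]


# Hypothetical prescreen result: the two highest concentrations are cytotoxic
kept = usable_concentrations([1e-6, 1e-5, 1e-4, 1e-3], [101.0, 108.0, 135.0, 190.0])
```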
Test chemical exposure and assay plate organisation
36.
Following trypsinisation of a confluent flask of cultured cells, cells are re-suspended at 1 × 10⁵ cells/ml in estrogen free assay medium. One hundred μl of re-suspended cells are plated in the inner wells of a 96-well microtiter plate. The outer wells are filled with 200 μl of Phosphate Buffered Saline (PBS) (see Figures 1 and 2). The plated cells are pre-incubated for 24 hours in a CO2 incubator (5 % CO2, 37 °C, 100 % humidity).
37.
After pre-incubation, the plates are inspected for visual cytotoxicity (see Appendix 4.1), contamination and confluence. Only plates that show no visual cytotoxicity or contamination and have a minimum of 85 % confluence are used for testing. The medium from the inner wells is carefully removed and replaced by 200 μl of estrogen free assay medium containing appropriate dilution series of reference standards, test chemicals, positive controls, negative controls and solvent controls (Table 5: agonist studies; Table 6: antagonist studies). All reference standards, test chemicals, positive controls, negative controls and solvent controls are tested in triplicate. In Figure 1, the plate layout for agonist testing is given. In Figure 2, the plate layout for antagonist testing is given. The plate layout for prescreen testing and comprehensive testing is identical. For antagonist testing, all inner wells, except for the vehicle control wells (VC), also contain a fixed concentration of agonist reference standard 17β-estradiol (3 × 10⁻¹² M). Note that reference standards C8 and C4 should be added to each TC plate.
38.
Following exposure of the cells to all chemicals, the 96-well microtiter plates should be incubated for another 24 hours in a CO2 incubator (5 % CO2, 37 °C, 100 % humidity).
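The split between PBS-filled outer wells and cell-containing inner wells, and the number of cells seeded per well, follow from simple arithmetic. The sketch below assumes the standard 8 × 12 well naming (rows A to H, columns 1 to 12); the function names are hypothetical, and the sketch does not reproduce the specific sample positions shown in Figures 1 and 2.

```python
from typing import List

ROWS = "ABCDEFGH"
COLUMNS = range(1, 13)


def plate_wells() -> List[str]:
    """All 96 well names, e.g. 'A1' to 'H12'."""
    return [f"{row}{col}" for row in ROWS for col in COLUMNS]


def is_outer(well: str) -> bool:
    """Outer wells lie in the first or last row or column and receive PBS only."""
    row, col = well[0], int(well[1:])
    return row in "AH" or col in (1, 12)


def cells_per_well(density_per_ml: float = 1e5, volume_ul: float = 100.0) -> float:
    """Number of cells seeded per well at the given density and plating volume."""
    return density_per_ml * volume_ul / 1000.0


inner_wells = [w for w in plate_wells() if not is_outer(w)]
outer_wells = [w for w in plate_wells() if is_outer(w)]
max_triplicate_conditions = len(inner_wells) // 3
```

With these assumptions, a plate offers 60 inner wells, which is room for at most 20 conditions tested in triplicate.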
Figure 1
Plate layout of the 96-well microtiter plates for prescreening and assessment of agonistic effect.
C0 = reference standard solvent.
C(1-8) = series of dilutions (1-8, low-to-high concentrations) of reference standard.
PC = positive control.
NC = negative control.
TCx-(1-8) = dilutions (1-8, low-to-high concentrations) of test chemical for the prescreen run and assessment of agonistic effect of test chemical x.
SC = solvent control of the test chemical (optimally the same solvent as in C0, but possibly from another batch).
Grey cells = outer wells, filled with 200 μl of PBS.
Figure 2
Plate layout of the 96-well microtiter plates for antagonistic prescreening and assessment of antagonistic effect.