Joseph Marini, Senior Research Biologist, presented the webinar, “The Future of High-Tier Amphibian Testing – Navigating Challenges and Exploring Advancements in the Larval Amphibian Growth and Development Assay (LAGDA).”
Read below for responses to the questions asked during the audience Q&A.
Q: What is the most challenging aspect of testing a UVCB using this model?
A: It would depend on the characteristics of the substance. For instance, petroleum-derived substances or surfactant blends would pose a significant challenge in the LAGDA, as their behavior in the diluter system is unpredictable and they are difficult to analyze. The length of the exposure period also poses a challenge, as the variability of analytical data when testing a UVCB would be expected to be high.
Q: Please elaborate on the possibility of testing thyroid in fish.
A: The Validation Management Group for Ecotoxicity Testing (VMG-ECO) has been looking into potential thyroid hormone system related endpoints for inclusion in existing OECD fish test guidelines, specifically No. 236 (fish embryo test) and No. 210 (fish early life-stage test). A detailed review paper was published by OECD on the TH system in fish and potential endpoints.
Multiple projects have been working on this initiative: Project 2.64 of the OECD Test Guidelines work plan, “Inclusion of thyroid endpoints in OECD fish Test Guidelines,” is led by Denmark with participation of Germany, Belgium, and the Netherlands, as part of the EndocRine Guideline Optimisation group. Working closely in tandem with this project has been Project 1.35 of the OECD AOP development program. Essentially, these projects are working to establish adverse outcome pathway networks for thyroid disruption in fish and to use those AOPs to potentially increase regulatory acceptance of certain endpoints.
There is now an inter-laboratory validation underway to look at thyroid morphology/histopathology, swim bladder inflation, eye development, and hormone levels in OECD 236 and OECD 210. Several reference substances and negative substances are to be used, and the potential utility of multiple species (zebrafish, fathead minnow, Japanese medaka) is to be explored.
Smithers is monitoring the progress of this work closely. There are also several recent journal articles published by the above groups, mostly with zebrafish, regarding the development of thyroid systems in fish.
Q: Have you ever received feedback from regulatory authorities on adequacy of the MTC, either for the AMA or the LAGDA?
A: We have not received feedback on the adequacy of the MTC on a LAGDA. An example from an AMA: a preliminary study was performed at concentrations below 100 mg/L and below the water solubility of the test substance. The high treatment concentration chosen for the definitive study was one at which no treatment-related effects were observed in the preliminary study. This resulted in regulatory authorities criticizing the MTC as not high enough. Over time, regulatory authority evaluation and scrutiny of the MTC has increased. As a result, more explanation of the high test concentration is now required in endocrine study reports.
Q: Based on your experience, does it often occur that a LAGDA actually provides more info on the T-modality compared to an (extended) AMA?
A: Data for this scenario is limited since very few LAGDAs have been conducted for substances that also have (E)AMA data. However, from what we’ve seen, the LAGDA has not provided any additional information on the T-modality compared to an AMA or an Extended AMA. In one instance, an extended AMA was conducted prior to the LAGDA; the extended AMA was actually more sensitive in detecting effects on time to stage NF62. Thyroid gland histology findings were similar and of similar sensitivity between the two studies. Also, the extended AMA detected changes in somatic growth (e.g. body weight) where the LAGDA did not detect such an effect.
Q: Have you experienced challenges with analytical recovery for highly adsorptive chemicals that may bind to the suspended feed particles?
A: The LAGDAs we’ve performed have utilized column saturators to dose the test systems – so the molecules were quite adsorptive – however we did not experience substantially reduced analytical recoveries of the test substances in those aquarium solutions. Whether the molecules did not bind to the suspended particles – or they did in some capacity and the flow-through design prevented us from seeing it in the analytical data – is difficult to determine. That data was also interesting since other potential binding sites increase dramatically in this study type e.g. the biofilm and waste buildup (especially after the larval phase) as the metamorphosed frogs grow.
The larval diet is quite dense, and so some of it settles out on the tank bottom for the tadpoles to forage on (similar to fish food); i.e., the entire aliquot isn’t suspended for an extended period. But to your point, I would expect highly adsorptive chemicals to bind to it in some capacity.
Q: What dosing methods do you use for volatiles / insoluble substances to maintain concentrations for the test duration?
A: For insoluble substances that are hydrolytically stable, Smithers often uses column saturators to dose the test systems. As described in the OECD Guidance Document on Aqueous-Phase Aquatic Toxicity Testing of Difficult Test Chemicals, this method is especially recommended for low solubility substances. Smithers has a long history of success with this type of solvent-free method in endocrine studies. Substances that are dosed using column saturators typically are highly adsorptive, and there are many potential binding sites for them in the exposure system, such as the glassware, silicone, the suspended larval food particles, waste, and biofilm buildup, or the animals themselves through bioaccumulation. Smithers has recently conducted two LAGDAs that utilized column saturators for dosing, and both studies resulted in analytical recoveries of the test substance at approximately 80 to 120% of nominal concentration over the entire test duration. Whether the molecules did not bind to the waste, glass, or food particles, or they did in some capacity and the high flow-through design prevented it from resulting in reduced recoveries in the analytical data, is difficult to determine.
The minimum test solution renewal rate in the OECD 241 TG is 5 volume replacements per day. A good method to employ for volatile/insoluble substances is to increase the solution volume replacement rate above this minimum. This can be a way to maintain test concentrations. However, we typically set a maximum limit for this renewal rate since the suspended larval food can wash out with the flow of water. Flushing the food out too quickly could impact the animals’ access to food and potentially impact growth rates, which are a control performance criterion.
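As a rough illustration of the arithmetic behind the renewal rate, the flow needed per tank scales directly with the number of daily volume replacements. The sketch below assumes a hypothetical 9 L tank and a hypothetical ceiling of 10 replacements per day; neither figure comes from the test guideline or from Smithers’ methods. Only the minimum of 5 per day is taken from the answer above.

```python
MIN_REPLACEMENTS_PER_DAY = 5  # minimum renewal rate cited in the answer above
MINUTES_PER_DAY = 24 * 60


def flow_ml_per_min(tank_volume_l, replacements_per_day):
    """Flow (mL/min) into one tank for a given number of daily volume replacements."""
    if replacements_per_day < MIN_REPLACEMENTS_PER_DAY:
        raise ValueError("renewal rate is below the minimum of 5 per day")
    return tank_volume_l * 1000 * replacements_per_day / MINUTES_PER_DAY


tank_volume_l = 9.0  # hypothetical tank volume, for illustration only
for replacements in (5, 10):  # 10 is a hypothetical upper limit
    rate = flow_ml_per_min(tank_volume_l, replacements)
    print(f"{replacements} replacements/day -> {rate:.2f} mL/min per tank")
```

Doubling the replacement rate doubles the flow through each tank, which is why an upper limit is useful when suspended food can wash out.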
Q: How many µL of blood do you get from a tadpole at NF 62?
A: We’ve found it is typically correlated to body weight. Tadpoles around 1 gram can typically yield 10 to 15 µL. A couple have maxed out at nearly 20 µL. Smaller tadpoles are between 5 to 10, sometimes < 5 µL. The smaller they are, the more variable the volumes seem to be.
Q: Can you talk to the failure rate of the validity/performance criteria for the LAGDA?
A: There isn’t much historical control data (“HCD”) to this point, but in the two completed studies, whether the criteria were met has depended on the criterion. Time to Metamorphosis (TTM; ≤45 days) and survival (≥80% in all reps) have been met. Water quality was met. The weight at NF62 and test termination hovered in or around the outer limits of the acceptable ranges. I think this speaks more to natural variation of the species, as I believe weight is a parameter that can vary more easily between labs/diet/rearing space/volume replacement rate, etc. TTM and weight variability amongst control replicates in our LAGDAs was as good as, and sometimes better than, most data sets reported in the inter-lab validation summary. Regarding test concentrations maintained within ±20% of mean measured values, that was met in both studies. We had developed a very robust saturator column method for both compounds in previous programs, and both substances behaved well over the course of the tests.
Q: How bad does microbial growth get when using solvent? Is there any data on whether there’s a significant impact on reducing solvent load rate from, say 20 µL/L to 10 µL/L? Or is there none and it’s just logical to assume that less carbon source, less microbial growth?
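For readers who want to see how the ±20% criterion works arithmetically, here is a minimal sketch. The concentration values are invented for illustration and are not data from either study; the function name and the choice to return the out-of-range samples are assumptions of this example, not a description of how Smithers evaluates its analytical data.

```python
from statistics import mean


def check_against_mean(measured, tolerance=0.20):
    """Return the mean measured value and any samples outside mean +/- tolerance."""
    avg = mean(measured)
    low, high = avg * (1 - tolerance), avg * (1 + tolerance)
    out_of_range = [value for value in measured if not (low <= value <= high)]
    return avg, out_of_range


# Hypothetical measured concentrations (µg/L) for one treatment level
samples = [9.8, 10.4, 11.1, 9.2, 10.0, 13.5]
avg, flagged = check_against_mean(samples)
print(f"mean measured = {avg:.2f} µg/L; outside ±20%: {flagged}")
```

In this made-up series only the last sample falls outside the band, so the treatment level would need a closer look.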
A: Of course, “bad” is a relative term but it seems very dependent on solvent type not just on solvent concentration. Using triethylene glycol (TEG), amount of biofilm is minimal up to 20 µL/L. We’ve had success in other chronic testing up to 30 µL/L but wouldn’t necessarily recommend that. This doesn’t mean that solvent effects (e.g. growth) never occur vs. negative control with TEG, but the cleanliness of the system in general is substantially better and as a result is much easier to maintain. DMF is likely “next best”, although we have observed solvent effects using concentrations as low as 10 µL/L in amphibian testing. Acetone has been notoriously worse in terms of biofilm buildup. We try to stay away from it if possible. Plus, the Mount & Brungs diluter apparatuses have a lot of solution in glassware exposed to light, which invites algal growth.
From 20 to 10 µL/L, lower is always better, but it depends on the context of testing. In other words, if you’re decreasing solvent concentration, that means you need a more concentrated test substance stock solution. So, if you’re butting up against the solubility of the test substance in that solvent, it could create more issues than it’s worth (problems with dosing the diluter reliably due to precipitate formation in your stock). There are multiple variables to consider when assessing what solvent concentration to use. Not sure there is enough amphibian data yet to say how much better 10 is than 20 µL/L.
Q: Do you know of any LAGDAs that were rejected or criticized by regulators because they believed the MTC used in the study wasn’t high enough?
A: Not that we know of. The LAGDAs we performed had AMA data available, and those AMAs had range-finder data where overt toxicity was observed, so they were tested up to the MTC. Those AMA definitive studies also had effects in endpoints that are considered “thyroid-mediated” in the ECHA/EFSA guidance documents, so there were concentrations to investigate as to whether they were adverse or not in a higher-tier study.
Q: If there are unfavorable results in LAGDA - is there a next step?
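The trade-off between solvent load and stock concentration can be sketched numerically. In this example the 1 mg/L target concentration and the 80 mg/mL solubility limit are hypothetical values chosen only to show the effect; the function names are likewise illustrative.

```python
def stock_conc_mg_per_ml(test_conc_mg_per_l, solvent_load_ul_per_l):
    """Stock concentration (mg/mL) needed to reach the test concentration
    when the stock is dosed at the given solvent load (µL of stock per L)."""
    solvent_ml_per_l = solvent_load_ul_per_l / 1000.0
    return test_conc_mg_per_l / solvent_ml_per_l


def stock_is_feasible(test_conc_mg_per_l, solvent_load_ul_per_l, solubility_mg_per_ml):
    """True if the required stock stays at or below the solubility in the solvent."""
    required = stock_conc_mg_per_ml(test_conc_mg_per_l, solvent_load_ul_per_l)
    return required <= solubility_mg_per_ml


target_mg_per_l = 1.0        # hypothetical highest test concentration
solubility_mg_per_ml = 80.0  # hypothetical solubility of the substance in the solvent
for load in (20, 10):
    required = stock_conc_mg_per_ml(target_mg_per_l, load)
    ok = stock_is_feasible(target_mg_per_l, load, solubility_mg_per_ml)
    print(f"{load} µL/L -> stock of {required:.0f} mg/mL, feasible: {ok}")
```

Halving the solvent load doubles the required stock concentration, so under these assumed numbers the 10 µL/L option would exceed the solubility limit while 20 µL/L would not.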
A: If ‘unfavorable’ implies thyroid-mediated effects that show adverse population-relevant effects, I think it would also depend on the regulatory body/industry. For instance, the ECHA/EFSA guidance document states that ED criteria would be met under this scenario and mentions a mode of action analysis may be necessary when potentially endocrine-related adverse effects and endocrine activity are identified and there is a need to establish biological plausibility of the link between adverse effect and the activity.
However, we have not received regulatory feedback from customers regarding any LAGDA results yet, so it is difficult to say at this point, as the regulatory environment seems fluid. I think it would also depend on the overall weight of evidence, e.g. data elsewhere in the dossier of the substance submission. It is difficult to determine since the LAGDA’s ability to fully capture adverse effects is limited: there is no reproduction phase, and no level 5 OECD conceptual framework studies incorporate thyroid endpoints (perhaps in the future for fish testing).
Q: What are your thoughts regarding using weight of Xenopus as an endpoint in the AMA given that metamorphosis results in a reduction in weight followed by a return to increasing weight?
A: The AMA is a reliable and sensitive assay for assessing thyroid activity. The test guideline’s recommendation to exclude tadpoles that have reduced weight (e.g. late-stage tadpoles) helps censor animals that could confound the results. However, having to exclude them can reduce the statistical power of the test, and this should be considered, especially if an appreciable number (e.g. >20%) reach a stage where their body weight begins to decrease. Animals in the AMA do not typically return to increasing weight, as those animals would be beyond the typical AMA life-stage.
Q: For Xenopus: What would be the best stage for performing thyroid histopathology?
A: Thyroid morphology is similar across the several tadpole stages that are typically analyzed for the thyroid histopathology endpoint in the AMA, so as long as the animals are stage-matched, no one stage is superior to another. However, stage 62 is ecologically relevant given that thyroid hormone concentrations peak at this stage. As it is a milestone in the EAMA and LAGDA, this would be a preferable stage for histopathology.
Q: Have there been any extended AMAs performed with reference chemicals?
A: Smithers performed an extended AMA with the reference chemical propylthiouracil (PTU) in 2024. US EPA researchers have also published studies using a similar time-to-stage design (Haselman et al. 2020, https://doi.org/10.1093/toxsci/kfaa036), including one with 3-nitro-L-tyrosine (Olker et al. 2018, https://doi.org/10.1093/toxsci/kfy203).
Q: You mentioned that the control replicates increase to 16 if solvent control group is included (+8 reps). What solvents are typically used and is this a last resort for low solubility substances?
A: Typical preference is triethylene glycol (TEG), acetone (ACT), and dimethylformamide (DMF) in that order. TEG is preferable to the other two; ACT and DMF are distant second/third. For low solubility substances, if rapid hydrolysis is not a physchem property concern, investigation into solvent-free methods is strongly recommended and any co-solvent use is treated as a last resort. Solvent-free methods such as saturator columns, similar to those described in the OECD Guidance Document on Aqueous-Phase Aquatic Toxicity Testing of Difficult Test Chemicals, are especially recommended for low solubility substances. Smithers has a long history of success with solvent-free methods.
Q: As stated in the GD 150 "The top dose or concentration should be sufficiently high to give clear systemic (i.e. non-endocrine specific) toxicity in order to ensure that a wide range of exposures (high to low) is tested." What would you say would be clear systemic toxicity? X% mortality e.g.?
A: During any range-finding exposure, there should be evidence of overt toxicity (>10% mortality) in at least one range-finding test concentration to ensure the maximum tolerable concentration is sufficiently reached. However, when determining the highest test concentration for the definitive test, Smithers has learned through experience performing many endocrine studies that using mortality as the primary or only parameter is difficult and imprecise. Typically, mortality and sub-lethal evidence of overt toxicity (e.g., abnormal appearance or behavior such as discoloration, differences in uneaten food in treatment aquaria vs. controls [evidence of reduced feeding], and lethargy, to name a few) should be carefully assessed in tandem, since the concentration-response can often be shallow. Wet body weight is a good secondary parameter to use as evidence of systemic toxicity; generally, at least a 10% difference in weight has been used to set the MTC, but again this should be applied on a case-by-case basis and in concert with other endpoints.
Q: Recently we have had issues in Europe regarding not testing up to the MTC. How does Smithers address this issue when determining concentrations in the range-finding tests?
A: Similar to the previous question, it is recommended to induce overt mortality in the range-finding test to ensure the MTC is bracketed but use other endpoints and observations to define the high test concentration for the definitive test.
Q: In studies where there is a clear effect on T3/T4 and developmental delays are seen, is the exposure period extended? How common is this? Is analysis done in real-time such that this information can be evaluated during the critical window (e.g., D60-D66) to inform whether to extend the study?
A: In LAGDA studies where T3/T4 are measured on a sub-sample of tadpoles at NF stage 62, the results do not dictate whether or not the study is continued. The test design continues to juvenile exposure regardless of T3/T4 results.
Q: Could you please give some typical examples of using the LAGDA in the context of pharmaceutical drug development, including the relevance of hormone level measurements?
A: A LAGDA with hormone level measurements could be used, for example, if mammalian data show thyroid activity/adversity effects and changes in thyroid hormone concentrations (e.g., OECD test guidelines 408, 414, 421, 422, 443). While time to metamorphosis (days to reach NF stage 62) is sensitive to thyroid-active compounds, this endpoint should be interpreted with caution because non-specific toxicity experienced prior to metamorphosis may impact the length of time it takes to reach NF stage 62. While thyroid gland histology at NF stage 62 is a hallmark for diagnosing thyroid disruption, thyroid hormone measurements can be added to provide supporting information on the mode of action.
Q: Is the LAGDA currently conducted in the area of pesticide registrations?
A: There have been two LAGDAs conducted at our facility for agrochemical substances for European submission.
Q: What change in TH level would be biologically meaningful: are there any experiences?
A: Smithers has not conducted GLP studies with thyroid hormones yet, but we will be embarking on several studies this year with THs as endpoints in amphibian studies. There are a few articles that describe work with thyroid active compounds and measurements of thyroid hormones in Xenopus laevis. I’ve attached them to this email. Most of it has been at U.S. EPA. Depending on the hormone, a reduction/induction of anywhere from ~30%, sometimes up to an order of magnitude, was detected as statistically significant in treated organisms. There are many factors that can impact this, some of which were described in the webinar, such as but not limited to: inherent variability of endogenous concentrations, whether aliquots of plasma are pooled from multiple individuals or not, time of day, assay sensitivity, etc.
Another aspect is whether or not those hormone changes result in an apical effect, e.g. does a biochemical change result in an adverse outcome. If not, the biological relevance could potentially be questioned. Wheeler et al. published a paper on the utility of hormone measurements that describes considerations for their relevance to regulatory testing. It covers several aspects, including what kind of replication would be needed to have adequate statistical power, whether or not the existing guidelines are sufficient for ED testing without hormones added, technical limitations, etc.
Q: How many LAGDAs has Smithers conducted?
A: As of March 2025, Smithers in Wareham, Massachusetts, has conducted two LAGDAs and there are 3 more booked for 2025/2026.
Q: Have you performed any extended AMAs?
A: Smithers has performed 12 extended AMAs since 2020.