eScholarship
Open Access Publications from the University of California

Structural and quantum chemical basis for OCP-mediated quenching of phycobilisomes.

(2024)

Cyanobacteria use large antenna complexes called phycobilisomes (PBSs) for light harvesting. However, intense light triggers non-photochemical quenching, where the orange carotenoid protein (OCP) binds to PBS, dissipating excess energy as heat. The mechanism of efficiently transferring energy from phycocyanobilins in PBS to canthaxanthin in OCP remains insufficiently understood. Using cryo-electron microscopy, we unveiled the OCP-PBS complex structure at 1.6- to 2.1-angstrom resolution, showcasing its inherent flexibility. Using multiscale quantum chemistry, we disclosed the quenching mechanism. Identifying key protein residues, we clarified how canthaxanthin's transition dipole moment in its lowest-energy dark state becomes large enough for efficient energy transfer from phycocyanobilins. Our energy transfer model offers a detailed understanding of the atomic determinants of light harvesting regulation and antenna architecture in cyanobacteria.

An integrated metagenomic, metabolomic and transcriptomic survey of Populus across genotypes and environments.

(2024)

Bridging molecular information to ecosystem-level processes would provide the capacity to understand system vulnerability and, potentially, a means for assessing ecosystem health. Here, we present an integrated dataset containing environmental and metagenomic information from plant-associated microbial communities, plant transcriptomics, plant and soil metabolomics, and soil chemistry and activity characterization measurements derived from the model tree species Populus trichocarpa. Soil, rhizosphere, root endosphere, and leaf samples were collected from 27 different P. trichocarpa genotypes grown in two different environments leading to an integrated dataset of 318 metagenomes, 98 plant transcriptomes, and 314 metabolomic profiles that are supported by diverse soil measurements. This expansive dataset will provide insights into causal linkages that relate genomic features and molecular level events to system-level properties and their environmental influences.

An Engineered Laccase from Fomitiporia mediterranea Accelerates Lignocellulose Degradation.

(2024)

Laccases from white-rot fungi catalyze lignin depolymerization, a critical first step to upgrading lignin to valuable biodiesel fuels and chemicals. In this study, a wildtype laccase from the basidiomycete Fomitiporia mediterranea (Fom_lac) and a variant engineered to have a carbohydrate-binding module (Fom_CBM) were studied for their ability to catalyze cleavage of β-O-4 ether and C-C bonds in phenolic and non-phenolic lignin dimers using a nanostructure-initiator mass spectrometry-based assay. Fom_lac and Fom_CBM catalyze β-O-4 ether and C-C bond breaking, with higher activity under acidic conditions (pH < 6). The potential of Fom_lac and Fom_CBM to enhance saccharification yields from untreated and ionic liquid pretreated pine was also investigated. Adding Fom_CBM to mixtures of cellulases and hemicellulases improved sugar yields by 140% on untreated pine and 32% on cholinium lysinate pretreated pine when compared to the inclusion of Fom_lac to the same mixtures. Adding either Fom_lac or Fom_CBM to mixtures of cellulases and hemicellulases effectively accelerates enzymatic hydrolysis, demonstrating its potential applications for lignocellulose valorization. We postulate that additional increases in sugar yields for the Fom_CBM enzyme mixtures were due to Fom_CBM being brought more proximal to lignin through binding to either cellulose or lignin itself.

RNA-guided genome engineering: paradigm shift towards transposons

(2024)

CRISPR-Cas systems revolutionized the genome engineering field but need to induce double-strand breaks (DSBs) and may be difficult to deliver due to their large protein size. Tn7-like transposons such as CRISPR-associated transposons (CASTs) can be repurposed for RNA-guided DSB-free integration, and obligate mobile element guided activity (OMEGA) proteins of the IS200/IS605 transposon family have been developed as hypercompact RNA-guided genome editing tools. CASTs and OMEGA are exciting, innovative genome engineering tools that can improve the precision and efficiency of editing. This review explores the recent developments and uses of CASTs and OMEGA in genome editing across prokaryotic and eukaryotic cells. The pros and cons of these transposon-based systems are deliberated in comparison to other CRISPR systems.

Metagenomics untangles potential adaptations of Antarctic endolithic bacteria at the fringe of habitability

(2024)

Survival and growth strategies of Antarctic endolithic microbes residing in Earth's driest and coldest desert remain virtually unknown. From 109 endolithic microbiomes, 4539 metagenome-assembled genomes were generated, 49.3% of which were novel candidate bacterial species. We present evidence that trace gas oxidation and atmospheric chemosynthesis may be the prevalent strategies supporting metabolic activity and persistence of these ecosystems at the fringe of life and the limits of habitability.

Experimental warming accelerates positive soil priming in a temperate grassland ecosystem.

(2024)

Unravelling biosphere feedback mechanisms is crucial for predicting the impacts of global warming. Soil priming, an effect of fresh plant-derived carbon (C) on native soil organic carbon (SOC) decomposition, is a key feedback mechanism that could release large amounts of soil C into the atmosphere. However, the impacts of climate warming on soil priming remain elusive. Here, we show that experimental warming accelerates soil priming by 12.7% in a temperate grassland. Warming alters bacterial communities, with 38% of unique active phylotypes detected under warming. The functional genes essential for soil C decomposition are also stimulated, which could be linked to priming effects. We incorporate lab-derived information into an ecosystem model showing that model parameter uncertainty can be reduced by 32-37%. Model simulations from 2010 to 2016 indicate an increase in soil C decomposition under warming, with a 9.1% rise in priming-induced CO2 emissions. If our findings can be generalized to other ecosystems over an extended period of time, soil priming could play an important role in terrestrial C cycle feedbacks and climate change.

Whole community shotgun metagenomes of two biological soil crust types from the Mojave Desert.

(2024)

We present six whole community shotgun metagenomic sequencing data sets of two types of biological soil crusts sampled at the ecotone of the Mojave Desert and Colorado Desert in California. These data will help us understand the diversity and function of biocrust microbial communities, which are essential for desert ecosystems.

An evaluation of GPT models for phenotype concept recognition.

(2024)

OBJECTIVE: Clinical deep phenotyping and phenotype annotation play a critical role in both the diagnosis of patients with rare disorders as well as in building computationally-tractable knowledge in the rare disorders field. These processes rely on using ontology concepts, often from the Human Phenotype Ontology, in conjunction with a phenotype concept recognition task (supported usually by machine learning methods) to curate patient profiles or existing scientific literature. With the significant shift in the use of large language models (LLMs) for most NLP tasks, we examine the performance of the latest Generative Pre-trained Transformer (GPT) models underpinning ChatGPT as a foundation for the tasks of clinical phenotyping and phenotype annotation. MATERIALS AND METHODS: The experimental setup of the study included seven prompts of various levels of specificity, two GPT models (gpt-3.5-turbo and gpt-4.0) and two established gold standard corpora for phenotype recognition, one consisting of publication abstracts and the other clinical observations. RESULTS: The best run, using in-context learning, achieved 0.58 document-level F1 score on publication abstracts and 0.75 document-level F1 score on clinical observations, as well as a mention-level F1 score of 0.7, which surpasses the current best in class tool. Without in-context learning, however, performance is significantly below the existing approaches. CONCLUSION: Our experiments show that gpt-4.0 surpasses the state of the art performance if the task is constrained to a subset of the target ontology where there is prior knowledge of the terms that are expected to be matched. While the results are promising, the non-deterministic nature of the outcomes, the high cost and the lack of concordance between different runs using the same prompt and input make the use of these LLMs challenging for this particular task.
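
For readers unfamiliar with the metric, the following is a minimal sketch of how a document-level F1 score over recognized ontology concepts could be computed. It assumes micro-averaging over per-document sets of concept identifiers; the abstract does not state the exact averaging protocol, so this may differ from the study's evaluation. The function name `document_level_f1` and the toy annotations are hypothetical.

```python
def document_level_f1(gold, predicted):
    """Micro-averaged F1 over per-document sets of concept IDs.

    gold, predicted: dicts mapping a document ID to an iterable of
    concept IDs. Micro-averaging is an assumption made for this sketch.
    """
    tp = fp = fn = 0
    for doc_id in gold.keys() | predicted.keys():
        g = set(gold.get(doc_id, ()))
        p = set(predicted.get(doc_id, ()))
        tp += len(g & p)   # concepts found in both gold and prediction
        fp += len(p - g)   # predicted but not in the gold standard
        fn += len(g - p)   # in the gold standard but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy annotations (identifiers are illustrative only).
gold = {"doc1": {"HP:0001250", "HP:0001263"}, "doc2": {"HP:0000252"}}
predicted = {"doc1": {"HP:0001250"}, "doc2": {"HP:0000252", "HP:0004322"}}
score = document_level_f1(gold, predicted)
```

Because each document contributes a set of concepts, repeated mentions of the same concept within one document count once, which is what distinguishes a document-level score from a mention-level score.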

A bacterial sensor taxonomy across earth ecosystems for machine learning applications.

(2024)

Microbial communities have evolved to colonize all ecosystems of the planet, from the deep sea to the human gut. Microbes survive by sensing, responding, and adapting to immediate environmental cues. This process is driven by signal transduction proteins such as histidine kinases, which use their sensing domains to bind or otherwise detect environmental cues and transduce signals to adjust internal processes. We hypothesized that an ecosystem's unique stimuli leave a sensor fingerprint, able to identify and shed insight on ecosystem conditions. To test this, we collected 20,712 publicly available metagenomes from Host-associated, Environmental, and Engineered ecosystems across the globe. We extracted and clustered the collection's nearly 18M unique sensory domains into 113,712 similar groupings with MMseqs2. We built gradient-boosted decision tree machine learning models and found we could classify the ecosystem type (accuracy: 87%) and predict the levels of different physical parameters (R2 score: 83%) using the sensor cluster abundance as features. Feature importance enables identification of the most predictive sensors to differentiate between ecosystems, which can lead to mechanistic interpretations if the sensor domains are well annotated. To demonstrate this, a machine learning model was trained to predict patients' disease state and used to identify domains related to oxygen sensing present in a healthy gut but missing in patients with abnormal conditions. Moreover, since 98.7% of identified sensor domains are uncharacterized, importance ranking can be used to prioritize sensors to determine what ecosystem function they may be sensing. Furthermore, these new predictive sensors can function as targets for novel sensor engineering with applications in biotechnology, ecosystem maintenance, and medicine. IMPORTANCE: Microbes infect, colonize, and proliferate due to their ability to sense and respond quickly to their surroundings. In this research, we extract the sensory proteins from a diverse range of environmental, engineered, and host-associated metagenomes. We trained machine learning classifiers using sensors as features such that it is possible to predict the ecosystem for a metagenome from its sensor profile. We use the optimized model's feature importance to identify the most impactful and predictive sensors in different environments. We next use the sensor profile from human gut metagenomes to classify their disease states and explore which sensors can explain differences between diseases. The sensors most predictive of environmental labels here, most of which correspond to uncharacterized proteins, are a useful starting point for the discovery of important environment signals and the development of possible diagnostic interventions.
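
As an illustration of what "sensor cluster abundance as features" might look like in practice, the sketch below turns per-metagenome lists of sensor cluster assignments into fixed-length feature vectors. It assumes relative abundance (counts divided by the sample total) as the normalization; the abstract does not specify one. The function name `abundance_matrix`, the sample names, and the cluster labels are hypothetical.

```python
from collections import Counter


def abundance_matrix(domain_clusters):
    """Build relative-abundance feature vectors per metagenome.

    domain_clusters: dict mapping a sample name to a list of cluster IDs,
    one entry per sensory domain found in that sample.
    Returns (vocab, rows): the sorted cluster IDs and, per sample, a list
    of relative abundances aligned with vocab.
    Relative-abundance normalization is an assumption of this sketch.
    """
    vocab = sorted({c for clusters in domain_clusters.values() for c in clusters})
    rows = {}
    for sample, clusters in domain_clusters.items():
        counts = Counter(clusters)
        total = sum(counts.values())
        rows[sample] = [counts[c] / total if total else 0.0 for c in vocab]
    return vocab, rows


# Hypothetical toy input: two samples, three sensor clusters.
samples = {
    "gut_1": ["c1", "c1", "c2"],
    "soil_1": ["c2", "c3", "c3", "c3"],
}
vocab, rows = abundance_matrix(samples)
```

Vectors of this shape could then be passed, together with ecosystem labels, to a gradient-boosted decision tree classifier of the kind the abstract describes, and that model's feature importances would be indexed by `vocab`.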