The inverse problem of designing component interactions to target emergent structure is fundamental to numerous applications in biotechnology, materials science, and statistical physics. Equally important is the inverse problem of designing emergent kinetics, but this has received considerably less attention. Using recent advances in automatic differentiation, we show how kinetic pathways can be precisely designed by directly differentiating through statistical physics models, namely free energy calculations and molecular dynamics simulations. We consider two systems that are crucial to our understanding of structural self-assembly: bulk crystallization and small nanoclusters. In each case, we are able to assemble precise dynamical features. Using gradient information, we manipulate interactions among constituent particles to tune the rate at which these systems yield specific structures of interest. Moreover, we use this approach to learn nontrivial features about the high-dimensional design space, allowing us to accurately predict when multiple kinetic features can be simultaneously and independently controlled. These results provide a concrete and generalizable foundation for studying nonstructural self-assembly, including kinetic properties as well as other complex emergent properties, in a vast array of systems.
Gene expression profiles of a cellular population, generated by single-cell RNA sequencing, contain rich information about biological state, including cell type, cell cycle phase, gene regulatory patterns, and location within the tissue of origin. A major challenge is to disentangle information about these different biological states from each other, including distinguishing them from cell lineage, since the correlation of cellular expression patterns is necessarily contaminated by ancestry. Here, we use a recent advance in random matrix theory, discovered in the context of protein phylogeny, to identify differentiation or ancestry-related processes in single-cell data. Qin and Colwell [C. Qin, L. J. Colwell, Proc. Natl. Acad. Sci. U.S.A. 115, 690-695 (2018)] showed that ancestral relationships in protein sequences create a power-law signature in the covariance eigenvalue distribution. We demonstrate the existence of such signatures in scRNA-seq data and that the genes driving them are indeed related to differentiation and developmental pathways. We predict the existence of similar power-law signatures for cells along linear trajectories and demonstrate this for linearly differentiating systems. Furthermore, we generalize to show that the same signatures can arise for cells along tissue-specific spatial trajectories. We illustrate these principles in diverse tissues and organisms, including the mammalian epidermis and lung, Drosophila whole-embryo, adult Hydra, dendritic cells, the intestinal epithelium, and cells undergoing induced pluripotent stem cell (iPSC) reprogramming. We show how these results can be used to interpret the gradual dynamics of lineage structure along iPSC reprogramming. Together, we provide a framework that can be used to identify signatures of specific biological processes in single-cell data without prior knowledge and identify candidate genes associated with these processes.
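One simple way to look for a power-law signature of the kind described above is to check whether the rank-ordered covariance eigenvalues fall on a straight line in log-log coordinates. The sketch below is only an illustration under that assumption: the function name `loglog_slope` and the synthetic spectrum are hypothetical, the eigenvalues are assumed to have been computed beforehand with any linear-algebra library, and a least-squares fit of this kind is not necessarily the analysis used in the paper.

```python
import math


def loglog_slope(eigenvalues):
    """Least-squares slope of log(eigenvalue) against log(rank)."""
    # Keep positive eigenvalues only and sort from largest to smallest.
    vals = sorted((v for v in eigenvalues if v > 0), reverse=True)
    if len(vals) < 2:
        raise ValueError("need at least two positive eigenvalues")
    xs = [math.log(rank) for rank in range(1, len(vals) + 1)]
    ys = [math.log(v) for v in vals]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic spectrum that follows rank**-1.5 exactly (illustrative values,
# not data from the paper).
synthetic = [rank ** -1.5 for rank in range(1, 51)]
slope = loglog_slope(synthetic)
```

On real data the fit would normally be restricted to the range of ranks over which the spectrum looks linear, and a straight line on its own does not establish an ancestry-related origin.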
Recent advances in synthetic posttranslational protein circuits are substantially impacting the landscape of cellular engineering and offer several advantages compared to traditional gene circuits. However, engineering dynamic phenomena such as oscillations in protein-level circuits remains an outstanding challenge. Few examples of biological posttranslational oscillators are known, necessitating theoretical progress to determine realizable oscillators. We construct mathematical models for two posttranslational oscillators, using few components that interact only through reversible binding and phosphorylation/dephosphorylation reactions. Our designed oscillators rely on the self-assembly of two protein species into multimeric functional enzymes that respectively inhibit and enhance this self-assembly. We limit our analysis to within experimental constraints, finding (i) significant portions of the restricted parameter space yielding oscillations and (ii) that oscillation periods can be tuned by several orders of magnitude using recent advances in computational protein design. Our work paves the way for the rational design and realization of protein-based dynamic systems.
The essence of turbulent flow is the conveyance of energy through the formation, interaction, and destruction of eddies over a wide range of spatial scales, from the largest scales where energy is injected down to the smallest scales where it is dissipated through viscosity. Currently, there is no mechanistic framework that captures how the interactions of vortices drive this cascade. We show that iterations of the elliptical instability, arising from the interactions between counter-rotating vortices, lead to the emergence of turbulence. We demonstrate how the nonlinear development of the elliptical instability generates an ordered array of antiparallel secondary filaments. The secondary filaments mutually interact, leading to the formation of even smaller tertiary filaments. In experiments and simulations, we observe two and three iterations of this cascade, respectively. Our observations indicate that the elliptical instability could be one of the fundamental mechanisms by which the turbulent cascade develops.
Collagen consists of three peptides twisted together through a periodic array of hydrogen bonds. Here we use this as inspiration to find design rules for programmed specific interactions for self-assembling synthetic collagen-like triple helices, starting from disordered configurations. The assembly generically nucleates defects in the triple helix, the characteristics of which can be manipulated by spatially varying the enthalpy of helix formation. Defect formation slows assembly, evoking kinetic pathologies that have been associated with mutations in the primary collagen amino acid sequence. The controlled formation and interaction of defects give a route for hierarchical self-assembly of bundles of twisted filaments.
The numerical solution of partial differential equations (PDEs) is challenging because of the need to resolve spatiotemporal features over wide length- and timescales. Often, it is computationally intractable to resolve the finest features in the solution. The only recourse is to use approximate coarse-grained representations, which aim to accurately represent long-wavelength dynamics while properly accounting for unresolved small-scale physics. Deriving such coarse-grained equations is notoriously difficult and often ad hoc. Here we introduce data-driven discretization, a method for learning optimized approximations to PDEs based on actual solutions to the known underlying equations. Our approach uses neural networks to estimate spatial derivatives, which are optimized end to end to best satisfy the equations on a low-resolution grid. The resulting numerical methods are remarkably accurate, allowing us to integrate in time a collection of nonlinear equations in 1 spatial dimension at resolutions 4x to 8x coarser than is possible with standard finite-difference methods.
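To make the idea of estimating spatial derivatives on a coarse grid concrete, the sketch below assumes the derivative is written as a weighted sum of neighboring grid values (a stencil). Standard finite-difference methods fix those weights in advance. In a data-driven variant, a trained model would supply the weights instead, which is represented here only by passing in a different coefficient list. The helper `apply_stencil`, the grid size, and the test function are hypothetical choices for illustration, not the paper's implementation.

```python
import math


def apply_stencil(u, coeffs, dx):
    """Estimate du/dx on a periodic grid as a weighted sum of neighbors.

    coeffs[k] multiplies u[i + k - half], where half = len(coeffs) // 2.
    """
    n = len(u)
    half = len(coeffs) // 2
    return [
        sum(c * u[(i + k - half) % n] for k, c in enumerate(coeffs)) / dx
        for i in range(n)
    ]


# Fixed coefficients of standard central differences (offsets -1..1, -2..2).
CENTRAL_2ND_ORDER = [-0.5, 0.0, 0.5]
CENTRAL_4TH_ORDER = [1 / 12, -2 / 3, 0.0, 2 / 3, -1 / 12]

# A deliberately coarse periodic grid: 16 points per wavelength of sin(x).
n = 16
dx = 2 * math.pi / n
u = [math.sin(i * dx) for i in range(n)]
exact = [math.cos(i * dx) for i in range(n)]


def max_error(coeffs):
    """Largest pointwise error of the stencil estimate against cos(x)."""
    approx = apply_stencil(u, coeffs, dx)
    return max(abs(a - e) for a, e in zip(approx, exact))


err2 = max_error(CENTRAL_2ND_ORDER)
err4 = max_error(CENTRAL_4TH_ORDER)
```

Comparing `err2` and `err4` shows how strongly the choice of weights affects accuracy at fixed resolution, which is the degree of freedom a learned discretization would optimize.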
Programmable self-assembly of smart, digital, and structurally complex materials from simple components at size scales from the macro to the nano remains a long-standing goal of materials science. Here, we introduce a platform based on magnetic encoding of information to drive programmable self-assembly that works across length scales. Our building blocks consist of panels with different patterns of magnetic dipoles that are capable of specific binding. Because the ratios of the different panel-binding energies are scale-invariant, this approach can, in principle, be applied down to the nanometer scale. Using a centimeter-sized version of these panels, we demonstrate 3 canonical hallmarks of assembly: controlled polymerization of individual building blocks; assembly of 1-dimensional strands made of panels connected by elastic backbones into secondary structures; and hierarchical assembly of 2-dimensional nets into 3-dimensional objects. We envision that magnetic encoding of assembly instructions into primary structures of panels, strands, and nets will lead to the formation of secondary and even tertiary structures that transmit information, act as mechanical elements, or function as machines on scales ranging from the nano to the macro.
The accurate prediction of RNA secondary structure from primary sequence has had enormous impact on research over the past 40 years. Although many algorithms are available to make these predictions, the inclusion of non-nested loops, termed pseudoknots, still poses challenges arising from two main factors: 1) no physical model exists to estimate the loop entropies of complex intramolecular pseudoknots, and 2) their NP-complete enumeration has impeded their study. Here, we address both challenges. First, we develop a polymer physics model that can address arbitrarily complex pseudoknots using only two parameters corresponding to concrete physical quantities, over an order of magnitude fewer than the sparsest state-of-the-art phenomenological methods. Second, by coupling this model to exhaustive enumeration of the set of possible structures, we compute the entire free energy landscape of secondary structures resulting from a primary RNA sequence. We demonstrate that for RNA structures of ~80 nucleotides, with minimal heuristics, the complete enumeration of possible secondary structures can be accomplished quickly despite the NP-complete nature of the problem. We further show that despite our loop entropy model's parametric sparsity, it performs better than or on par with previously published methods in predicting both pseudoknotted and non-pseudoknotted structures on a benchmark data set of RNA structures of ≤80 nucleotides. We suggest ways in which the accuracy of the model can be further improved.
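The exhaustive enumeration mentioned above can be illustrated at toy scale. The sketch below assumes a drastically simplified definition of a structure: any set of Watson-Crick or G-U wobble pairs in which each nucleotide pairs at most once and paired positions are separated by at least `min_loop` unpaired positions. It has no stems, no free energies, and none of the heuristics the abstract alludes to, so it is practical only for very short sequences. The names `enumerate_structures` and `is_pseudoknotted` are hypothetical.

```python
# Allowed pairings: Watson-Crick plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def enumerate_structures(seq, min_loop=3):
    """Return every set of base pairs in which each nucleotide pairs at most once.

    Crossing (pseudoknotted) pairs are allowed. Each structure is a tuple of
    (i, j) index pairs with i < j, listed in order of increasing i.
    """
    n = len(seq)
    results = []

    def extend(i, used, pairs):
        if i == n:
            results.append(tuple(pairs))
            return
        if i in used:
            # Position i is already the partner of an earlier nucleotide.
            extend(i + 1, used, pairs)
            return
        # Option 1: leave nucleotide i unpaired.
        extend(i + 1, used, pairs)
        # Option 2: pair i with a later, still-free complementary nucleotide j.
        for j in range(i + min_loop + 1, n):
            if j not in used and (seq[i], seq[j]) in PAIRS:
                used.add(j)
                pairs.append((i, j))
                extend(i + 1, used, pairs)
                pairs.pop()
                used.discard(j)

    extend(0, set(), [])
    return results


def is_pseudoknotted(pairs):
    """True if any two pairs cross, i.e. i < k < j < l."""
    return any(i < k < j < l for (i, j) in pairs for (k, l) in pairs)


structures = enumerate_structures("GAAAC")
```

Because the number of such pairings grows very rapidly with sequence length, a practical enumeration needs additional structure, for example building from stems rather than single pairs, which this sketch omits.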
Neuronal activity induces changes in blood flow by locally dilating vessels in the brain microvasculature. How can the local dilation of a single vessel increase flow-based metabolite supply, given that flows are globally coupled within the microvasculature? Solving the supply dynamics for rat brain microvasculature, we find one parameter regime to dominate physiologically. This regime allows for a robust increase in supply independent of the position in the network, which we explain analytically. We show that local coupling of vessels promotes spatially correlated increased supply by dilation.
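The global coupling of flows can be seen in a toy network, sketched here under assumptions that are not taken from the paper: each vessel obeys the Hagen-Poiseuille law (resistance proportional to length over radius to the fourth power), and the network is just one inlet vessel feeding two parallel branches at a fixed total pressure drop. The function names and all numerical values are hypothetical, and the sketch models flow only, not the metabolite supply dynamics studied in the abstract.

```python
import math


def poiseuille_resistance(radius, length=1.0, viscosity=1.0):
    """Hydraulic resistance of a cylindrical vessel (Hagen-Poiseuille law)."""
    return 8.0 * viscosity * length / (math.pi * radius ** 4)


def branch_flows(r_inlet, r_a, r_b, pressure_drop=1.0):
    """Flows through parallel vessels a and b fed by a shared inlet vessel."""
    res_inlet = poiseuille_resistance(r_inlet)
    res_a = poiseuille_resistance(r_a)
    res_b = poiseuille_resistance(r_b)
    res_parallel = res_a * res_b / (res_a + res_b)
    total_flow = pressure_drop / (res_inlet + res_parallel)
    # Pressure drop across the parallel pair, shared by both branches.
    dp_parallel = total_flow * res_parallel
    return dp_parallel / res_a, dp_parallel / res_b


# Baseline: all radii equal. Then dilate branch a alone by 10 percent.
flow_a_base, flow_b_base = branch_flows(1.0, 1.0, 1.0)
flow_a_dil, flow_b_dil = branch_flows(1.0, 1.1, 1.0)
```

In this toy network, dilating branch a raises its own flow but lowers the flow in the undilated branch b, because both share the inlet vessel. This is the kind of coupling that makes the question posed in the abstract nontrivial.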
Deep neural networks have achieved state-of-the-art accuracy at classifying molecules with respect to whether they bind to specific protein targets. A key breakthrough would occur if these models could reveal the fragment pharmacophores that are causally involved in binding. Extracting chemical details of binding from the networks could enable scientific discoveries about the mechanisms of drug actions. However, doing so requires shining light into the black box that is the trained neural network model, a task that has proved difficult across many domains. Here we show how the binding mechanism learned by deep neural network models can be interrogated, using a recently described attribution method. We first work with carefully constructed synthetic datasets, in which the molecular features responsible for "binding" are fully known. We find that networks that achieve perfect accuracy on held-out test datasets still learn spurious correlations, and we are able to exploit this nonrobustness to construct adversarial examples that fool the model. This makes these models unreliable for accurately revealing information about the mechanisms of protein-ligand binding. In light of our findings, we prescribe a test that checks whether a hypothesized mechanism can be learned. If the test fails, it indicates that the model must be simplified or regularized and/or that the training dataset requires augmentation.
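The abstract does not name the attribution method it uses. As one example of this family of techniques, the sketch below implements integrated gradients, which attributes a model's output to its input features by averaging gradients along a straight path from a baseline input to the actual input. Everything here is an assumption made for illustration: the toy `model` is a hand-written function rather than a trained network, the gradients are taken by finite differences, and the input values are arbitrary.

```python
import math


def model(x):
    """Toy differentiable 'binding score'; a stand-in, not a trained network."""
    z = 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[0] * x[2]
    return 1.0 / (1.0 + math.exp(-z))


def numerical_gradient(f, x, eps=1e-6):
    """Central-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return grad


def integrated_gradients(f, x, baseline, steps=200):
    """Per-feature attributions: (x - baseline) times the path-averaged gradient."""
    total = [0.0] * len(x)
    for s in range(steps):
        alpha = (s + 0.5) / steps  # midpoint rule along the straight path
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = numerical_gradient(f, point)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


x = [1.0, 0.5, 2.0]          # hypothetical input features
baseline = [0.0, 0.0, 0.0]   # hypothetical reference input
attributions = integrated_gradients(model, x, baseline)
```

A useful sanity check is that the attributions sum, up to discretization error, to `model(x) - model(baseline)`. Comparing per-feature attributions against a known ground-truth mechanism is the kind of interrogation the synthetic datasets described above make possible.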
When vortex rings collide head-on at high enough Reynolds numbers, they ultimately annihilate through a violent interaction which breaks down their cores into a turbulent cloud. We experimentally show that this very strong interaction, which leads to the production of fluid motion at very fine scales, uncovers direct evidence of an iterative cascade of instabilities in a bulk fluid. When the coherent vortex cores approach each other, they deform into tentlike structures and the mutual strain causes them to locally flatten into extremely thin vortex sheets. These sheets then break down into smaller secondary vortex filaments, which themselves rapidly flatten and break down into even smaller tertiary filaments. By performing numerical simulations of the full Navier-Stokes equations, we also resolve one iteration of this instability and highlight the subtle role that viscosity must play in the rupturing of a vortex sheet. The agreement of this observed iterative cascade of instabilities over various scales with recent theoretical predictions could provide a mechanistic framework in which the evolution of turbulent flows can be examined in real time as a series of discrete dynamic instabilities.
Understanding and controlling polyelectrolyte adsorption onto carbon nanotubes is a fundamental challenge in nanotechnology. Polyelectrolytes have been shown to stabilize nanotube suspensions by adsorbing onto the nanotube surface, and polyelectrolyte-coated nanotubes are emerging as building blocks for complex and addressable self-assembly. Conventional wisdom suggests that polyelectrolyte adsorption onto nanotubes is driven by specific chemical or van der Waals interactions. We develop a simple mean-field model and show that ion image attraction significantly affects adsorption onto conducting nanotubes at low salt concentrations. Our theory suggests a simple strategy to selectively and reversibly functionalize carbon nanotubes on the basis of their electronic structures, which in turn modify the ion image attraction.
Creating a selective gel that filters particles based on their interactions is a major goal of nanotechnology, with far-reaching implications from drug delivery to controlling assembly pathways. However, this is particularly difficult when the particles are larger than the gel's characteristic mesh size because such particles cannot passively pass through the gel. Thus, filtering requires the interacting particles to transiently reorganize the gel's internal structure. While significant advances, e.g., in DNA engineering, have enabled the design of nanomaterials with programmable interactions, it is not clear what physical principles such a designer gel could exploit to achieve selective permeability. We present an equilibrium mechanism where crosslink binding dynamics are affected by interacting particles such that particle diffusion is enhanced. In addition to revealing specific design rules for manufacturing selective gels, our results have the potential to explain the origin of selective permeability in certain biological materials, including the nuclear pore complex.
A ubiquitous feature of bacterial communities is the existence of spatial structures. These are often coupled to metabolism, whereby the spatial organization can improve chemical reaction efficiency. However, it is not clear whether or how a desired colony configuration, for example, one that optimizes some overall global objective, could be achieved by individual cells that do not have knowledge of their positions or of the states of all other cells. By using a model which consists of cells producing enzymes that catalyze coupled metabolic reactions, we show that simple, local rules can be sufficient for achieving a global, community-level goal. In particular, even though the optimal configuration varies with colony size, we demonstrate that cells regulating their relative enzyme levels based solely on local metabolite concentrations can maintain the desired overall spatial structure during colony growth. We also show that these rules can be very simple and hence easily implemented by cells. Our framework also predicts scenarios where additional signaling mechanisms may be required.
The nasal cavity is a vital component of the respiratory system that heats and humidifies inhaled air in all vertebrates. Despite this common function, the shapes of nasal cavities vary widely across animals. To understand this variability, we here connect nasal geometry to its function by theoretically studying the airflow and the associated scalar exchange that describes heating and humidification. We find that optimal geometries, which have minimal resistance for a given exchange efficiency, have a constant gap width between their side walls, while their overall shape can adhere to the geometric constraints imposed by the head. Our theory explains the geometric variations of natural nasal cavities quantitatively, and we hypothesize that the tradeoff between high exchange efficiency and low resistance to airflow is the main driving force shaping the nasal cavity. Our model further explains why humans, whose nasal cavities evolved to be smaller than expected for their size, become obligate oral breathers in aerobically challenging situations.