Main

Lungs have evolved complex and diverse architectures that combine a large surface with an exquisitely thin barrier for efficient exchange of oxygen and carbon dioxide between air and blood. In mammalian lungs, gas exchange occurs in tightly packed alveoli, the terminal airspaces of the bronchial tree, which are surrounded by walls that contain a dense network of capillaries (Fig. 1a). The discovery of alveoli and their associated capillaries by Malpighi in the seventeenth century inaugurated what became the study of the structural basis of gas exchange, providing the foundation for modern respiratory physiology and pulmonary medicine2,3. Efforts to understand the cellular structure of the barrier have focused mainly on epithelial cells, beginning with the recognition that alveoli are lined by a continuous epithelium composed of intermixed alveolar type 1 (AT1) and AT2 cell types4. AT1 cells are large, thin and highly extended cells that comprise 95% of the respiratory surface across which diffusion occurs, whereas cuboidal AT2 cells secrete surfactant that prevents alveolar collapse1. Although much progress has been made in understanding the development, maintenance and repair of the alveolar epithelium5,6,7, the cells of the alveolar endothelium—the other side of the air–blood barrier—have received less attention.

Fig. 1: Two stable, intermingled alveolar capillary cell types.

a, Alveolar capillaries in adult mouse lung immunostained for PECAM1. b, t-distributed stochastic neighbour embedding (t-SNE) plot of endothelial cell populations annotated in scRNA-seq data for adult mouse lung13. c, Heat map of expression of capillary subset markers and the general endothelial marker Cldn5 in individual capillary cells. CPM, counts per million. d–f, Single-molecule fluorescent in situ hybridization (smFISH) for the capillary subset markers Apln (d, f) or Ednrb and Car4 (aCap) (e), and Aplnr (gCap) (d–f), in adult mouse lung. Images in d (right) and f show individual aCap and gCap cells. g, Relative abundance of aCap cells, gCap cells and cells that co-express aCap and gCap markers (intermediate (IM) cells) in lungs from 3-month-old (young) and 24-month-old (aged) mice (data shown as mean; n = 500 cells scored per mouse; 2 mice per group). h, i, Co-expression of tdTomato lineage label (asterisks) and aCap marker Ednrb but not gCap marker Aplnr (h), or gCap marker Aplnr but not aCap marker Apln (i), in lungs collected six months after mature aCap (h) or gCap (i) cells were lineage-labelled. Blue, DAPI. Scale bars, 10 μm.


Intermingled alveolar capillary cell types

We systematically defined the cellular diversity of the pulmonary endothelium in the adult mouse lung by single-cell RNA sequencing (scRNA-seq) and mapping, and identified two molecularly distinct populations of capillary cells in the alveolus (Fig. 1b–d, Extended Data Fig. 1a–g, Supplementary Tables 1, 2). The two subsets are interspersed throughout the gas-exchange region, creating apparently random diversity within the alveolar capillary network, and their relative abundance changes little with age (Fig. 1e–g, Extended Data Fig. 1h–j). To test whether the capillary populations are stable cell types or interconverting cell states, we used complementary genetic strategies to permanently label each subset (using either Apln-creER or Aplnr-creER) and analysed the expression of subset markers in labelled cells after 48 hours, 1 month or 6 months to determine whether the labelled population continues to express the markers, or whether cells turn them off and start to express markers of the other population. We found minimal (0.2% at 6 months) interconversion (Fig. 1h, i, Extended Data Fig. 1k–n), indicating that the populations are not transient cell states. We conclude that the alveolar capillary network is composed of two intermingled, stable cell types, which we call gCap (general capillary cells) and aCap (aerocytes; see below).

Aerocytes are specialized for gas exchange

We used sparse cell labelling and deep imaging to visualize individual capillary cells in three dimensions. aCap cells are complex, large cells (spanning more than 100 μm; 21 × 10³ μm³ mean volume) with ramified extensions that surround pores (mean of 6 pores per cell, range 2–9), giving cells the appearance of Swiss cheese (Fig. 2a, c, Extended Data Fig. 2f–h, Supplementary Video 1). The cells have a variety of sizes and shapes, and a single cell frequently spans multiple alveoli. Morphological complexity of this kind has also been described for AT1 cells1.

Fig. 2: Specialized alveolar capillary cell types in gas exchange and capillary renewal.

a, b, Single aCap (aerocyte) (a) or gCap (b) cells in adult Apln-creER; Rosa26-Confetti (a) or Aplnr-creER; Rosa26-Confetti (b) lungs. c, Quantification of individual cell volumes. Bar indicates mean (19 aCap and 17 gCap cells scored from n = 2 mice). d, aCap and gCap cells in adult Cdh5-creER; Rosa26-Confetti lung form multicellular tubes (asterisks) within capillaries surrounding a single alveolus (dotted outline). Blue, elastin fibres (a, b, d). e, Schematic of alveolar capillary network. Asterisks, multicellular tubes. RBC, red blood cell. f–i, Transmission electron micrographs of adult mouse alveolar walls. f, Thick and thin regions of the air–blood barrier. g–i, Apln-creER; Rosa26-tdTomato (g, h) or Aplnr-creER; Rosa26-tdTomato (i) lungs immunostained for tdTomato (heavy black stain). Labelled aerocytes (g, h) but not gCap cells (i) are associated with thin regions (dashed lines). j, Quantification of the percentage of each labelled cell type associated with thick or thin regions (n = 2 mice of each genotype; 21 labelled aCap cells and 24 labelled gCap cells scored). k, Schematic representation of the air–blood barrier. l, m, Analysis of the proliferation of lineage-labelled aCap (l) or gCap (m) cells during adult homeostasis, detected by cumulative EdU incorporation for six weeks. The mean percentage of EdU+ aCap cells (l) or gCap cells (m) is shown at the bottom left (n = 400–4,000 cells scored per lung in n = 2 mice of each genotype). tdT, tdTomato. n, Quantification of the fraction of EdU+ lineage-labelled aCap or gCap cells during the indicated intervals after elastase-induced injury (Inj.) or mock injury with saline as control (Ctrl) (data shown as mean; n = 200–1,600 cells scored per lung in 2–4 mice of each genotype per time point and treatment group; see Methods for exact sample sizes). o, smFISH for lineage label (tdTomato), aCap (Ednrb) and gCap (Aplnr) markers in gCap-lineage-labelled lung six weeks after elastase injury. Blue, DAPI (l, m, o). p, Quantification of the fraction of lineage-labelled gCap (left) or aCap (right) cells expressing aCap (Ednrb), gCap (Aplnr or Ptprb) or aCap and gCap markers (IM) six weeks after elastase injury in injured and uninjured (Ctrl) regions (n = 500–1,000 cells scored per region; 3 injured and uninjured regions scored in n = 2 mice of each genotype). q, Relative abundance of capillary cell types in injured and uninjured (Ctrl) regions, six weeks after elastase administration (n = 800–5,600 cells scored per region; 3 injured and uninjured regions; n = 3 mice). Scale bars, 10 μm (a, b, d, l, m, o); 2 μm (f–i).


gCap cells have a related but less extreme morphology. They are smaller (spanning less than 40 μm; 4 × 10³ μm³ mean volume), have fewer pores (mean of 3 pores per cell, range 1–6) and are less extensively branched, rarely spanning multiple alveoli (Fig. 2b, c, Extended Data Fig. 2e, g, h, Supplementary Video 2). The two cell types fit together to form multicellular tubes (Fig. 2d, e, Supplementary Video 3). The mean surface area of aCap cells is four to five times greater than that of gCap cells (Extended Data Fig. 2g), but they are fourfold less abundant (Fig. 1g), hence each contributes about half of the total capillary surface area. The morphologies of both types—especially aCap cells—are distinct from the morphologies of capillary cells elsewhere in the lung, within the bronchial circulation, and in other organs (Extended Data Fig. 3a–g), reflecting the unique architecture and function of the pulmonary circulation. Comparison of their molecular diversity (Extended Data Fig. 3h–l) suggests that capillary cells in other organs are more similar to gCap cells (supporting the name ‘general’ capillary), whereas aCap cells are unique to the lung.
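The estimate that each cell type contributes about half of the capillary surface follows from simple arithmetic, sketched here for illustration only. The function name and the per-cell normalization (one gCap cell counted as one unit of area) are assumptions made for this example; the inputs are the ratios quoted in the text, a four- to fivefold larger mean surface area for aCap cells and a fourfold greater abundance of gCap cells.

```python
def acap_surface_share(area_ratio: float, gcap_per_acap: float) -> float:
    """Fraction of total capillary surface contributed by aCap cells.

    area_ratio: mean aCap surface area divided by mean gCap surface area.
    gcap_per_acap: number of gCap cells per aCap cell.
    Hypothetical helper; areas are expressed in units of one gCap cell.
    """
    acap_total = area_ratio * 1.0      # one aCap cell, area_ratio units of area
    gcap_total = 1.0 * gcap_per_acap   # gcap_per_acap gCap cells, one unit each
    return acap_total / (acap_total + gcap_total)


# Ratios taken from the text: aCap area is 4-5x gCap area, gCap cells are 4x more abundant.
for ratio in (4.0, 5.0):
    share = acap_surface_share(ratio, gcap_per_acap=4.0)
    print(f"area ratio {ratio:.0f}x -> aCap share of surface {share:.2f}")
```

With these inputs the aCap share falls between one half and five ninths, which is consistent with the statement that each type contributes about half.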

Capillaries are asymmetrically positioned within alveolar walls such that only some of the endothelium is tightly apposed to squamous AT1 cells to form thin regions of the gas-exchange surface in which the barrier to diffusion is minimized, whereas other (‘thick’) regions are separated from the epithelium by stromal cells and connective tissue3 (Fig. 2f). To look for differences in the localization of the cell types in these structurally distinct regions, we performed immuno-electron microscopy on lungs in which aCap and gCap cells were separately labelled. We found that thin regions are composed entirely of aCap cells, whereas gCap cells are positioned in contact with stromal cells in thick regions (Fig. 2g–j, Extended Data Fig. 4a–c). Because of their close association with AT1 cells within thin regions of the respiratory surface (Fig. 2k) and their expansive morphology, which reflects a specialized role in gas exchange analogous to AT1 cells, we term aCap cells ‘aerocytes’.

gCap cells are capillary stem cells

Little is known about how alveolar capillaries are maintained throughout life and repaired after alveolar damage7. To examine the behaviours of the capillary cell types in alveolar homeostasis, we first analysed proliferation by cumulative labelling with 5-ethynyl-2′-deoxyuridine (EdU) for six weeks in mice in which either gCap cells or aerocytes were genetically labelled. Capillary cell turnover was slow8 but, notably, proliferation was almost entirely restricted to gCap cells (7.7% EdU+ gCap cells; Fig. 2l, m). We detected extremely rare, solitary EdU+ aerocytes (2 of 4,401 cells), whereas EdU+ gCap cells were present as clusters of up to 10 cells, indicative of focal proliferation.
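As a rough illustration of how restricted proliferation is to gCap cells, the two figures quoted above can be put on the same scale. This is a sketch under stated assumptions: the aerocyte value pools the 2 EdU+ cells among 4,401 scored, whereas the gCap value is the reported mean percentage, so the fold difference is only an order-of-magnitude guide. The variable names are hypothetical.

```python
# Counts and percentage taken from the text; variable names are illustrative.
edu_positive_aerocytes = 2
aerocytes_scored = 4401
gcap_edu_percent = 7.7  # reported mean percentage of EdU+ gCap cells

# Pooled percentage of EdU+ aerocytes.
aerocyte_edu_percent = 100.0 * edu_positive_aerocytes / aerocytes_scored

# Approximate fold difference between the two cell types.
fold_difference = gcap_edu_percent / aerocyte_edu_percent

print(f"EdU+ aerocytes: {aerocyte_edu_percent:.3f}%")
print(f"gCap / aerocyte labelling ratio: about {fold_difference:.0f}-fold")
```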

Acute lung injury can induce the proliferation of alveolar capillary cells8,9. To investigate the role of the cell types in capillary repair, we used a mouse model of emphysema10 (Extended Data Fig. 5a, b) in which elastase-induced alveolar damage is accompanied, we found, by robust capillary cell proliferation. Lineage-labelled gCap cells proliferated as early as day 3 after elastase instillation (17% EdU+), with almost all (93%) of the gCap cells in injured regions being EdU+ at 6 weeks (Fig. 2n, Extended Data Fig. 5c–f). Aerocytes rarely proliferated even after injury (0.2% EdU+ at 3 days, 1.7% at 6 weeks).

We examined the fate of lineage-labelled gCap cells after injury and found labelled aerocytes as well as gCap cells, demonstrating that aerocytes are generated from gCap cells during repair (Fig. 2o, p). We also detected rare, lineage-labelled aerocytes in the absence of injury, after extended chases (3.4% lineage-labelled aCap cells at 14 months; Extended Data Fig. 1n)—indicating that aerocytes are generated intermittently from gCap cells during homeostasis. We conclude that gCap cells function as specialized stem/progenitor cells that replenish the alveolar capillary endothelium during maintenance and repair.

Even after six weeks of recovery from injury, the cellular composition of the alveolar capillary network is altered (Fig. 2q), suggesting that repair is abnormal or incomplete at this stage. Aberrant or insufficient repair may underlie vascular changes in lung diseases such as emphysema and interstitial lung disease, as well as respiratory distress syndromes that accompany severe injury or virus-induced alveolar damage—as in coronavirus disease 2019 (COVID-19)11,12. Understanding the behaviour of gCap cells, and the signals that activate their proliferation and reprogramming, may offer a strategy to restore the normal pattern.

Molecular functions of capillary cell types

We used scRNA-seq profiles13 to discover common and additional type-specific functions of capillary cells. We found only a small number of genes (Scn7a, Mapt) that were expressed by all (or most), and only, alveolar capillary cells, suggesting that few if any molecular functions are carried out by both capillary cell types but not by other lung endothelial cells (Extended Data Fig. 1a, b). By contrast, we identified many genes with roles in physiology, immune interactions and signalling, the expression of which differed between the cell types, revealing further specialization (Fig. 3a, Extended Data Fig. 4d, Supplementary Table 2).

Fig. 3: Molecular functions of alveolar capillary cell types.

a, Dot plots showing the expression of selected differentially expressed genes in mouse aerocytes and gCap cells. b–f, Diagrams of proposed specialized functions of alveolar capillary cell types. Aerocyte genes, blue; gCap genes, green; aerocyte and gCap genes, yellow; pericyte genes16, purple. b, Leukocyte trafficking. c, Antigen presentation. TCR, T cell receptor. d, Vasomotor control. gCap cells express EDN1, eNOS (encoded by Nos3) and PTGIS, making them a unique source of vasomodulators. EDN1 can signal to EDNRA (expressed on pericytes) or to EDNRB (expressed on aerocytes), which may feed back to gCap cells (dashed arrow) to regulate vasodilator production. IP3, inositol trisphosphate; NO, nitric oxide; PGI2, prostaglandin I2. e, Haemostasis. Roman numerals indicate coagulation factors. TF, tissue factor. APC, activated protein C. Pro, procoagulants; anti, anticoagulants. f, Lipid metabolism. Lipoprotein lipase (LPL), anchored to the lumen of gCap cells by GPIHBP1, converts circulating lipoprotein triglycerides to monoglycerides, which are broken down to free fatty acids by monoacylglycerol lipase (MGLL) expressed by aerocytes. HDL, high-density lipoprotein; HDL–FC, HDL bound to free cholesterol; S1P, sphingosine 1-phosphate; SphK, sphingosine kinase.

Some functions appear to be unique to one cell type. Aerocytes are the likely site of leukocyte trafficking—which is primarily a capillary function in the lung14—as they specifically express adhesion and leukocyte-sequestration genes (Fig. 3a, b). gCap cells, in contrast, express genes that encode MHC class II components, suggesting that they present antigens (Fig. 3a, c). gCap cells may also have a specialized role in vasomotor control (see below; Fig. 3a, d). Other functions appear to be distributed across both cell types. The two cell types produce distinct pro- and anticoagulants, suggesting that they have different roles in haemostasis (Fig. 3a, e), and they may cooperate in lipid metabolism, forming an ‘assembly line’ that produces fatty acids (Fig. 3a, f).

Our analysis also revealed that aerocytes and gCap cells can signal to one another. Aerocytes are a source of ligands (for example, apelin (encoded by Apln), kit ligand (Kitl)) that signal through cognate receptors (Aplnr, Kit) that are displayed by gCap cells; conversely, gCap cells produce ligands (for example, endothelin 1 (Edn1), vascular endothelial growth factor A (Vegfa)) with cognate receptors (Ednrb, Kdr) on aerocytes (Fig. 3a, Extended Data Fig. 4d, e). Such bidirectional signalling indicates that the two cell types can regulate each other.

We also identified specialized signalling interactions with other alveolar cell types. Aerocytes express Ednrb and Kdr, suggesting that they interact with AT1 cells (Fig. 3a, Extended Data Fig. 4d, e). gCap cells express a vasoconstrictor (Edn1) that can signal to endothelin receptor type A (Ednra) on pericytes; they also express endothelial nitric oxide synthase (Nos3) and prostaglandin I2 synthase (Ptgis), making them a source of vasodilators (Fig. 3a, d, Extended Data Fig. 4d, e). This indicates that gCap cells regulate vasomotor tone through interactions with pericytes. These specialized signalling relationships reflect distinct associations of the capillary cell types with surrounding cells, revealing functional compartmentalization within the alveolus (Extended Data Fig. 4f).
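The ligand and receptor pairs named in the two preceding paragraphs can be written down as a small directed graph, which makes the bidirectional relationship between aerocytes and gCap cells explicit. The sketch below is illustrative only: the data structure and function names are assumptions, and the list contains just the example pairs stated in the text, not the full set of interactions in the expression data.

```python
from collections import defaultdict

# (sender cell type, ligand gene, receptor gene, receiver cell type),
# transcribed from the example pairs named in the text.
INTERACTIONS = [
    ("aerocyte", "Apln", "Aplnr", "gCap"),
    ("aerocyte", "Kitl", "Kit", "gCap"),
    ("gCap", "Edn1", "Ednrb", "aerocyte"),
    ("gCap", "Vegfa", "Kdr", "aerocyte"),
    ("gCap", "Edn1", "Ednra", "pericyte"),
]


def receivers_by_sender(interactions):
    """Map each sender cell type to the set of cell types it can signal to."""
    graph = defaultdict(set)
    for sender, _ligand, _receptor, receiver in interactions:
        graph[sender].add(receiver)
    return dict(graph)


def is_bidirectional(interactions, cell_a, cell_b):
    """True if cell_a signals to cell_b and cell_b signals back to cell_a."""
    graph = receivers_by_sender(interactions)
    return cell_b in graph.get(cell_a, set()) and cell_a in graph.get(cell_b, set())


print(receivers_by_sender(INTERACTIONS))
print(is_bidirectional(INTERACTIONS, "aerocyte", "gCap"))
```

In this representation the aerocyte and gCap nodes point at each other, whereas the gCap to pericyte edge runs in one direction only, because the text names no pericyte-derived ligand.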

Development and ageing of alveolar capillaries

To determine when the capillary cell types emerge during development, we investigated their origin in the embryonic lung, in which a dense vascular plexus surrounds branching airways (Fig. 4a). Using lineage tracing, we found that this plexus—which is composed of small (2 × 10³ μm³ mean volume), simple, proliferating endothelial cells—gives rise to both subsets of alveolar capillary cells (Fig. 4b–d, Extended Data Fig. 6a–c, Supplementary Data 1, 2). The near-complete labelling of the capillary network suggests that the plexus is the major or sole source of aerocytes and gCap cells. To determine whether individual plexus cells can give rise to both capillary cell types, we performed a clonal analysis. Clones contained both aerocytes and gCap cells, demonstrating that plexus cells are bipotent (Fig. 4e, f, Extended Data Fig. 6d–f, Supplementary Video 4).

Fig. 4: Development and evolution of specialized alveolar capillary cell types.

a, Plexus surrounding airways in immunostained E13.5 mouse lung. b, Sparsely labelled plexus cells in E12.5 Aplnr-creER; Rosa26-Confetti lung immunostained for red fluorescent protein (RFP). c, Near-complete labelling of alveolar capillaries in postnatal day (P)60 Aplnr-creER; Rosa26-tdTomato lungs from mice that received tamoxifen at E12.5. Endomucin, green; tdTomato, red. d, smFISH to detect tdTomato (white) in aCap (Apln, red) and gCap (Aplnr, green) cells in P60 Aplnr-creER; Rosa26-tdTomato lung that was lineage-labelled at E12.5. e, Clone in P25 Aplnr-creER; Rosa26-Confetti lung composed of aCap and gCap cells derived from a single yellow fluorescent protein (YFP)-expressing plexus cell that was labelled at E14.5. f, Schematics depicting the origin of both alveolar capillary cell types from single bipotent cells in the embryonic plexus. g, Individual aerocytes in Apln-creER; Rosa26-Confetti lungs at the indicated stages. Dots, pores. h, Quantification of individual cell volumes at the indicated stages (n > 10 cells scored for each cell type at each time point from n = 2 mice; see Methods for exact cell number; adult aCap and gCap cells from Fig. 2c). Bar, mean value. i, Quantification of Vwf-expressing capillary cells in lungs from 3-month-old (young) and 24-month-old (aged) mice (mean ± s.d.; n > 500 cells scored per mouse; 3 mice per age group). j, smFISH for capillary cell-type markers in alveolar capillary cells in lung tissue from a 75-year-old man. k, Heat map of expression of cell-type markers in individual human capillary cells16. UP10K, unique molecular identifiers per ten thousand. l, Human adenocarcinoma vessel, containing cells that co-express EDNRB (aCap marker) and PTPRB (gCap marker). See also Extended Data Fig. 9a. m–o, Dot plots showing expression in aCap and gCap cells for selected conserved genes (type 0, leukocyte trafficking; m); genes with species-specific specialized expression (type 1, haemostasis; n), or genes that switch cell type between species (mouse13, human16) (type 2, antigen presentation; o). See Extended Data Fig. 10c–e. p, Schematic depicting alligator lung faveolus. q, Faveolar capillary network in alligator lung immunostained for claudin 5 (CLDN5). r, Co-expression (asterisks) of the mammalian alveolar capillary cell-type markers EDNRB (aCap) and APLNR (gCap) in faveolar capillary cells (CLDN5, white) in alligator lung. Blue, DAPI (d, j, l, r) or elastin fibres (e, g). Scale bars, 10 μm.


Aerocytes first emerge at embryonic day (E)17.5 and begin to acquire their Swiss-cheese-like morphology during embryonic development (Fig. 4g, h, Extended Data Figs. 2, 6g, h). This is consistent with a recent study that identified emerging aerocytes and characterized their gene expression and morphology15. The emergence of capillary cell types is gradual and asynchronous, and both cell types continue to mature molecularly and morphologically after birth (Fig. 4g, h, Extended Data Figs. 2, 6h–j, 7, Supplementary Data 3, 4). These results reveal a remarkable transformation of the plexus into the alveolar capillary network, beginning in the embryonic lung.

Endothelial cell phenotypes change with age, and may contribute to age-related disease. We detected a widespread induction of von Willebrand factor (encoded by Vwf), which is considered a marker of endothelial dysfunction, in the lungs of aged mice. Vwf is induced specifically in gCap cells, but not in aerocytes (Fig. 4i, Extended Data Fig. 6k–m), indicating that the cell types are differentially regulated during ageing.

Human alveolar capillary cell types

We identified both cell types intermingled within human alveolar capillary networks (Fig. 4j, k, Extended Data Fig. 8a–i, Supplementary Video 5), indicating these cell types have been conserved in mammalian evolution16. Similarly to mice, aerocytes emerge during embryonic development in humans (Extended Data Fig. 8j).

The mosaic pattern of capillary cells, however, is lost or altered in lung tumours. In human adenocarcinoma vessels, we observed abundant intermediate cells that co-express markers of both cell types (Fig. 4l, Extended Data Fig. 9a). Cell composition is also altered in mouse adenomas; tumour vessels are composed of gCap cells and intermediate cells, with few or no aerocytes (Extended Data Fig. 9b–f).

As in mice, we identified genes with key roles in physiology, immune interactions and signalling that exhibited specialized expression in the human cell types (Extended Data Fig. 10a, b, Supplementary Table 3). Many genes with cell-type specificity in mice are also expressed by the corresponding human cell type16 (Fig. 4m, Extended Data Fig. 10c). However, we also identified mouse–human differences. Some genes show specialized expression in only one species, including genes involved in functions that are distributed between the cell types (Fig. 4n, Extended Data Fig. 10d). For other genes, the cell type with specialized expression switches between mouse and human, presumably reflecting species-specific functional differences (Fig. 4o, Extended Data Fig. 10e).

Our analysis suggests that some cell-type-specific functions are conserved between mice and humans. For example, specialized leukocyte-trafficking genes are restricted to aerocytes, and gCap cells may regulate vasomotor tone in both species (Fig. 4m, Extended Data Fig. 10a–c). But capillary cell types can also gain (or lose) functions: in mice, gCap cells preferentially express MHC class II genes, whereas in humans, these genes are expressed by aerocytes (Fig. 4o, Extended Data Fig. 10a, b, e).

Evolution of capillary cell specialization

To investigate the evolutionary origins of the cell types, we examined capillary cell diversity in lungs from the American alligator (Alligator mississippiensis) and the western painted turtle (Chrysemys picta bellii)—reptiles from distinct phylogenetic groups (Extended Data Fig. 11a). Alligator and turtle respiratory systems are, in many ways, representative of the ancestral amniote condition17, and gas exchange occurs across a thick air–blood barrier in faveoli, which are surrounded by capillary nets that resemble those of mouse and human alveoli (Fig. 4p, q, Extended Data Fig. 11b–d, g–k, m, Supplementary Video 6). We found that in each species, lung capillary cells express markers of both mammalian cell types (Fig. 4r, Extended Data Fig. 11e, f, l). However, in contrast to alveolar capillary cells, alligator and turtle faveolar capillary cells co-express these cell-type markers, suggesting that they may lack the cell specialization of mammalian lungs.

Discussion

Here we show that the alveolar capillary endothelium, like the alveolar epithelium, is composed of two intermingled cell types. Such cell-type specialization may have evolved to optimize gas exchange within the complex environment of the alveolus, balancing and integrating structure and function. Aerocytes and AT1 cells are both large, complex cells that are tightly apposed in the thinnest regions of the gas-exchange surface—separated only by a shared, compositionally unique basement membrane18—which facilitates diffusion. This specialized interface may be particularly important in lung injury. In pulmonary oedema, seen early in acute respiratory distress syndrome (ARDS), fluid accumulates in thick regions, initially protecting thin regions and preserving gas exchange19,20. The alveolar capillary cell types arise—like the epithelium—from bipotent progenitors, through distinct maturation programs6. Aerocytes first emerge as AT1 differentiation begins, highlighting coordination between the two cell types critical for gas exchange15. During adult life, the alveolar endothelium is maintained and repaired by gCap cells, which—like AT2 cells—are ‘bifunctional’ stem/progenitor cells5,6 that also serve physiological functions. Separating the progenitor function from aerocytes and AT1 cells may be another mechanism for preserving the gas-exchange surface. Capillary changes could underlie pathologies in the lung (Fig. 4i, l, Extended Data Fig. 9) and other organs, making it essential to now explore and map the full heterogeneity of capillary cell types, states and specializations in health, ageing and disease; to identify changes in capillary composition, which cell types change and how they change; and to investigate the consequences of capillary changes for organ function.

Methods

Mice

The following mouse strains were used. C57BL/6 (C57BL/6NCrl, Charles River Laboratories, strain 027) was the wild-type strain. Apln-creER (Aplntm1.1(cre/ERT2)Bzsh)21 (provided by B. Zhou), Aplnr-creER (Tg(Aplnr-cre/ERT2)#Krh)22 (provided by K. Red-Horse), Cdh5-creER (Tg(Cdh5-cre/ERT2)1Rha)23 (provided by R. Adams) and Sftpc-creER (Sftpctm1(cre/ERT2,rtTA)Hap)24 (provided by H. Chapman) were used for conditional expression of Cre recombinase. Rosa26-tdTomato (Gt(ROSA)26Sortm14(CAG-tdTomato)Hze)25 (The Jackson Laboratory, strain 007914), which expresses cytoplasmic tdTomato after recombination, and Rosa26-Confetti (Gt(ROSA)26Sortm1(CAG-Brainbow2.1)Cle)26 (The Jackson Laboratory, strain 017492), which expresses membrane-targeted Cerulean CFP, nuclear GFP, cytoplasmic EYFP or cytoplasmic RFP after recombination, were used as Cre reporters. KrasLSL-G12D (Krastm4Tyj/J)27 (The Jackson Laboratory, strain 008179) was used to express a constitutively active form of KRAS from the endogenous locus after Cre-mediated recombination. All experimental mice and embryos were heterozygous (or hemizygous) for indicated alleles. Only female mice and embryos were used for experiments with Apln-creER, as Apln is X-linked21. Noon of the day a vaginal plug was detected was considered as E0.5. The day a litter was born was considered as P0. For induction of Cre recombinase activity, tamoxifen (Sigma, T5648) was dissolved in corn oil and administered by intraperitoneal (i.p.) injection unless otherwise noted. Adult lungs were perfused, inflated with 2% low melting point agarose (Invitrogen), and collected as previously described6. Heart, brain, small intestine, thyroid and kidney were collected after perfusion of the left heart with Ca²⁺- and Mg²⁺-free phosphate-buffered saline, pH 7.4 (PBS; Gibco). Postnatal (P7) retinas28 and embryonic lungs29 were collected and prepared as previously described.

Tamoxifen dose, administration and tissue collection schedules for individual experiments (unless noted elsewhere) were as follows:

For cell-type stability pulse-chase experiments (presented in Fig. 1h, i, Extended Data Fig. 1k–n): 4 mg tamoxifen to adults, lungs collected after 48 h, 1 month, 6 months or 14 months.

For cell morphology (sparse labelling) experiments: 2 mg (Fig. 2a, b, Extended Data Fig. 2e, f), 0.5 mg (Extended Data Fig. 3a) or 1 mg (Fig. 2d, Extended Data Fig. 3b–f) tamoxifen to adults, collected after 5–7 days; 2 mg (Fig. 4b) or 0.5 mg (Extended Data Fig. 2a) tamoxifen to pregnant dams at E11.5, collected at E12.5, 2 mg tamoxifen at E17.5, collected at E18.5 (Fig. 4g, Extended Data Fig. 2b), 0.5 mg tamoxifen at E18.5, collected at P0 (Extended Data Fig. 2c) or 3 mg tamoxifen at E18.5, collected at P0 (Extended Data Fig. 2d); 0.2 mg tamoxifen at P5 by intragastric injection, collected at P7 (Extended Data Fig. 3g) or 0.05 mg tamoxifen by intragastric injection at P5, collected at P7 or P14 (Fig. 4g).

For lineage-tracing experiments: 4 mg tamoxifen administered to pregnant dams; lungs collected from progeny at P21 or P60 (Fig. 4c, d, Extended Data Fig. 6c, Supplementary Data 1); 0.5 mg tamoxifen administered at P7, collected at P21 (Extended Data Fig. 6f).

For maximal labelling experiments: two or three 4 mg tamoxifen doses (administered 48 h apart) to Apln-creER; Rosa26-tdTomato (Fig. 2g, h, Extended Data Fig. 4a, b) or Aplnr-creER; Rosa26-tdTomato (Fig. 2i, Extended Data Fig. 4a, c) adult mice, collected 5–14 days after the first dose.

To induce adenoma formation: 4 mg tamoxifen to Sftpc-creER;KrasLSL-G12D/+ adult mice (Extended Data Fig. 9b, c, f), collected three weeks later.

Mice were housed and bred in the animal facility at Stanford University in accordance with Institutional Animal Care and Use Committee guidance, and were maintained on a 12-h light–dark cycle with food and water provided ad libitum. Adult mice were 2–6 months old, unless otherwise noted. All mouse experiments were approved by the Stanford University Institutional Animal Care and Use Committee.

Human tissue

De-identified healthy human adult lung tissue from 69- and 75-year-old men and a 66-year-old woman was obtained from the Stanford Tissue Bank. De-identified aborted human fetal lung tissue (17 and 23 weeks) was obtained in collaboration with the Stanford Family Planning Research Team, Department of Obstetrics and Gynecology, Division of Family Planning Services and Research, Stanford University School of Medicine. De-identified human tissue representing well-differentiated invasive lung adenocarcinoma (confirmed by pathological evaluation by S.Y.T.) from a 41-year-old woman was obtained from archival diagnostic material in collaboration with the Stanford Department of Pathology, Stanford University School of Medicine. Tissue collection and use in research were approved by the Stanford Institutional Review Board.

Alligators

Lungs were collected from juvenile (body mass, 1.3 kg) and adult (body mass, 14 kg) American alligators (A. mississippiensis, Daudin; male), acquired from the Rockefeller Wildlife Refuge. Lungs were inflated with sterile PBS or 10% formalin for smFISH, or 4% paraformaldehyde (PFA; Electron Microscopy Sciences (EMS)) in PBS for immunostaining. Experiments were approved by the University of Utah Animal Care and Use Committee.

Turtles

Lungs were collected from two adult (body mass, 262 g and 281 g) western painted turtles (C. p. bellii; male; The Turtle Source). Lungs were inflated with 10% formalin for smFISH, or 4% PFA in PBS for immunostaining. Experiments were approved by the University of Utah Institutional Animal Care and Use Committee.

Immunostaining

Immunostaining was performed using previously published protocols6,29 with modifications for adult mouse, human, alligator and turtle tissues as described below. Adult mouse, alligator and turtle lungs and human lung tissue pieces were fixed in 4% PFA in PBS at 4 °C for 2–3 h, then dehydrated through a PBS and methanol series into 100% methanol and stored at −20 °C. Immediately before sectioning, tissue was rehydrated through a methanol and PBT (PBT: 0.1% Tween-20 in PBS) series into PBT. Sections (350 μm) were cut from adult mouse lung lobes on a vibratome (Leica Biosystems). Alligator and turtle lungs, and human lung pieces, were manually cut with a platinum coated double-edge razor blade (EMS) into rough sections 0.5–3 mm thick. Sections were incubated with primary antibody for three nights and secondary antibody for two nights. Sections stained using peroxidase-conjugated secondary antibodies were incubated in tyramide reagents (Perkin Elmer; 1:100) for 45 min. Stained sections were post-fixed in 4% PFA in PBS at 4 °C for 1 h, dehydrated into methanol and cleared in benzyl alcohol:benzyl benzoate (1:2; BABB), or cleared in Vectashield (Vector Laboratories) for confocal imaging.

Immunostaining of the human adenocarcinoma sample was performed on a formalin-fixed paraffin-embedded tumour section using the BOND automated staining system with ER2 epitope retrieval solution and the BOND Polymer Refine Detection system (Leica Biosystems), which includes a haematoxylin counterstain. Adjacent sections were used for immunostaining and smFISH.

Primary antibodies used, at indicated concentrations, were: CD34 (BD Biosciences, 347660; 1:160); claudin 5 (Abcam, ab53765; 1:300); E-cadherin (BD Biosciences, 610181; 1:100); endomucin (Invitrogen, eBioV.7C7, 14-5851-82; 1:300); integrin α8 (R&D, AF4076; reconstituted to 1 mg/ml, used at 1:500); PECAM1 (rat anti-mouse; BD Biosciences, 553370; 1:5,000 for staining embryonic lung, 1:500 for staining adult lung); PECAM1 (mouse anti-human; R&D, BBA7; reconstituted to 0.5 mg/ml in PBS, used at 1:200); tdTomato (Rockland, 600-401-379; 1:300); and VE-cadherin (R&D, AF938; reconstituted to 0.5 mg/ml in PBS, used at 1:300).

Secondary antibodies used, at indicated concentrations, were: donkey anti-goat IgG, Alexa Fluor 568 conjugated (Invitrogen, A11057; 1:250); horse anti-mouse IgG, peroxidase conjugated (Vector Laboratories, PI-2000; 1:150); goat anti-rabbit IgG, peroxidase conjugated (Vector Laboratories, PI-1000; 1:125–1:250); goat anti-rabbit IgG, Alexa 568 conjugated (Invitrogen, A11036; 1:250); goat anti-rat IgG, Alexa 488 conjugated (Invitrogen, A11006; 1:250), for embryonic lung; donkey anti-rat IgG, Alexa 647 conjugated (Jackson Immunoresearch, 712-605-153; 1:250); goat anti-rat IgG, biotin conjugated (Vector Laboratories, BA-9401; 1:250), for embryonic lung; goat anti-rat IgG, peroxidase conjugated (Vector Laboratories, PI-9401; 1:250), for adult lung. DAPI (4′,6-diamidino-2-phenylindole, dihydrochloride, Invitrogen, D1306; reconstituted in PBS, used at 2 μg/ml), to stain nuclei, and/or Alexa Fluor 350 hydrazide (Invitrogen, A10439; reconstituted to 0.5 mg/ml in PBS, used at 1:100) or Alexa Fluor 633 hydrazide (Invitrogen, A30634; reconstituted to 0.5 mg/ml in PBS, used at 1:500–1:1,000), to visualize elastin fibres, were added along with secondary antibody.
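For reagents listed above with both a reconstituted stock concentration and a dilution, the working concentration follows by division. The Python sketch below illustrates that arithmetic only; the function name and the dictionary layout are illustrative and not part of the original protocol.

```python
def working_conc_ug_per_ml(stock_mg_per_ml, dilution_factor):
    """Working concentration (ug/ml) of a stock (mg/ml) used at 1:dilution_factor."""
    return stock_mg_per_ml * 1000.0 / dilution_factor

# Stock concentrations and dilutions as stated in the antibody lists above.
reagents = {
    "integrin alpha8 (AF4076)": (1.0, 500),
    "PECAM1, mouse anti-human (BBA7)": (0.5, 200),
    "Alexa Fluor 350 hydrazide (A10439)": (0.5, 100),
}

for name, (stock, dilution) in reagents.items():
    print(f"{name}: {working_conc_ug_per_ml(stock, dilution):.2f} ug/ml")
```

Reagents listed only as a dilution (for example, 1:250) cannot be converted this way because their stock concentrations are not stated.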

smFISH

Mouse, alligator and turtle lungs, inflated as described above, human lung or adult mouse kidney tissue were fixed in 10% neutral buffered formalin (Fisher Scientific) for 24 h at room temperature and transferred to 70% ethanol (made up in PBS) following 3 brief washes in PBS for embedding in paraffin. Sections were cut at 6 μm. smFISH was performed using a proprietary high-sensitivity RNA amplification and detection technology (RNAscope, Advanced Cell Diagnostics), according to the manufacturer’s instructions using the indicated proprietary probes, the RNAscope Multiplex Fluorescent Reagent Kit (v.2) and TSA Plus reagents (Perkin Elmer; 1:500 dilution for Cy3 and Cy5 dyes, 1:250 dilution for FITC) or Opal dyes (Akoya Biosciences, 1:500 dilution for Opal 570 and 620 dyes, 1:250 dilution for Opal 520 and 690 dyes). After smFISH, sections were incubated in DAPI (used at 2 μg/ml in PBS) for 5 min to counterstain nuclei and mounted in Prolong Gold antifade reagent (Invitrogen). Proprietary (Advanced Cell Diagnostics) probes used were: mouse, Mm-Aplnr (436171), Mm-Vwf (499111), Mm-Ednrb (473801, 473801-C2, 473801-C3), Mm-Apln (415371-C2), Mm-Ptprb (481391-C2), Mm-H2-Ab1 (414731-C2), Mm-Car4 (468421-C3), Mm-Gpihbp1 (540631-C3), Mm-Cldn5 (491611-C3), Mm-Pecam1 (316721-C3), tdTomato (317041-C3); human, Hs-EDN1 (459381), Hs-PTPRB (588141), Hs-CA4 (438561), Hs-EDNRB (528301, 528301-C2), Hs-CLDN5 (517141-C2, 517141-C3), Hs-VWF (560461-C3), Hs-APLN (449971-C3); alligator, Ami-APLNR (576071), Ami-PTPRB (828711), Ami-EDNRB (576081-C2), Ami-CA4 (828621-C2), Ami-CLDN5 (576091-C3); western painted turtle, Cpi-APLNR (828481), Cpi-EDNRB (828471-C2), Cpi-CLDN5 (828461-C3).

For quantification of capillary cell-type abundance, aCap and gCap cells were detected using probes for Ednrb or Ptprb, respectively, and the pan-endothelial probe Cldn5. Cldn5-expressing alveolar cells with 2 or more Ednrb puncta and 0–1 Ptprb puncta were classified as aCap; cells with 2 or more Ptprb puncta and 0–1 Ednrb puncta were classified as gCap; and cells with 2 or more Ednrb and 2 or more Ptprb puncta were classified as capillary intermediate (IM) cells. A total of 500 Cldn5-expressing alveolar cells were scored per lung in 5–10 random fields of view taken with a Plan-Apochromat 25× objective (Carl Zeiss Microscopy), using Volocity software (Quorum Technologies). For quantification of capillary cell-type distribution, capillary cells were scored in the last generation of alveoli immediately adjacent to the pleura and in intra-acinar regions of left and right cranial lobes from 3-month-old mice. For quantification of capillary cell-type abundance in mouse adenomas, capillary cells were scored in sections from tumours with intra-acinar (rather than pleural) location and round, compact morphologies with clear boundaries between tumour and surrounding alveolar tissue. For quantification of Vwf induction with age, Ptprb+Cldn5+ alveolar cells were classified as gCap and Ptprb−Cldn5+ alveolar cells were classified as aCap. Cells with three or more Vwf puncta were scored as positive. For quantification of aerocyte emergence in the fetal human lung, CA4+APLN+EDNRB+ triple-positive cells with five or more puncta for each transcript were classified as emerging aerocytes. For quantification of capillary fate conversion upon elastase injury, aCap and gCap cells were detected using probes for Ednrb (aCap) and Ptprb or Aplnr (gCap) in injured areas, identified as regions with enlarged airspaces and remodelled elastin fibres, in 3–4-month-old Apln-creER; Rosa26-tdTomato and Aplnr-creER; Rosa26-tdTomato lungs collected six weeks after elastase instillation.
For quantification of capillary cell-type abundance in the human lung, CLDN5+ cells with 2 or more EDNRB puncta and 0–1 PTPRB (or EDN1) puncta were classified as aCap; CLDN5+ cells with 2 or more PTPRB (or EDN1) puncta and 0–1 EDNRB puncta were classified as gCap; and CLDN5+ cells with 2 or more EDNRB and 2 or more PTPRB (or EDN1) puncta were classified as ‘IM’.
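The puncta thresholds used for abundance scoring can be written as a small decision rule. The Python sketch below is an illustration, not the scoring procedure itself, which was performed on images in Volocity. The function name and the "unscored" label for cells with fewer than two puncta of both markers are assumptions, because the text does not say how such cells were handled.

```python
from collections import Counter

def classify_capillary_cell(acap_puncta, gcap_puncta):
    """Classify one Cldn5-expressing alveolar cell from smFISH puncta counts.

    acap_puncta: number of Ednrb (EDNRB) puncta.
    gcap_puncta: number of Ptprb (PTPRB, or EDN1 in human) puncta.
    """
    if acap_puncta >= 2 and gcap_puncta <= 1:
        return "aCap"
    if gcap_puncta >= 2 and acap_puncta <= 1:
        return "gCap"
    if acap_puncta >= 2 and gcap_puncta >= 2:
        return "IM"
    # Fewer than 2 puncta for both markers: handling is not defined in the text.
    return "unscored"

# Toy puncta counts (hypothetical values, not data from the study).
toy_cells = [(5, 0), (0, 4), (3, 2), (1, 1)]
print(Counter(classify_capillary_cell(a, g) for a, g in toy_cells))
```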

Histology

Haematoxylin and eosin (H&E) staining was performed using standard protocols on formalin-fixed, paraffin-embedded alligator, turtle and fetal human lung tissue, processed as described above for smFISH. Adjacent sections were used for H&E staining and smFISH.

Electron microscopy and ultrastructural analysis

To visualize capillaries within alveolar walls, perfused and inflated adult mouse lungs were fixed in 2% glutaraldehyde in PBS for 1 h at room temperature. Tissue was manually cut with a platinum-coated double-edge razor blade (EMS) into rough sections. Samples were post-fixed in Karnovsky’s fixative (2% glutaraldehyde (EMS) and 4% PFA (EMS) in 0.1 M sodium cacodylate (EMS) pH 7.4) for 1 h, and incubated in cold aqueous 1% osmium tetroxide (EMS), washed, stained in 1% uranyl acetate for 2 h, dehydrated into 100% ethanol, infiltrated with Embed 812 resin (EMS) and cured at 65 °C overnight. Sections (75–90 nm) were collected on formvar/carbon-coated slot copper grids, observed in the JEM-1400 transmission electron microscope (JEOL) with a 120-kV beam and imaged with an Orius SC1000 (Gatan) digital camera.

For immuno-electron microscopy, perfused and inflated Apln-creER; Rosa26-tdTomato and Aplnr-creER; Rosa26-tdTomato adult lungs were fixed in 4% PFA and 0.1% glutaraldehyde (EMS) in PBS for 3 h at 4 °C. Sections (200 μm) were cut on a vibratome and then immunostained for tdTomato as described above. DAB (3,3′-diaminobenzidine)–nickel (Vector Laboratories, SK-4100) was used as a substrate for the peroxidase conjugated to the secondary antibody. Sections were incubated in DAB–nickel working solution (prepared following the manufacturer’s instructions) for 6–15 min at room temperature. After washing, samples were processed for electron microscopy as described, omitting uranyl acetate staining. To determine the percentage of aerocytes or gCap cells associated with thin or thick regions of the air–blood barrier, capillaries with complete lumens and containing immunolabelled endothelial cells were identified on sections viewed by electron microscopy. Labelled endothelial cells (n = 2 mice of each genotype, 21 labelled aCap cells and 24 labelled gCap cells) were scored as being associated either with the thin region (defined as regions in which the endothelial cell is tightly apposed to the epithelium) or the thick region (defined as regions in which the endothelial cell is clearly separated from the epithelium by stromal cells or connective tissue fibres). Samples were observed by electron microscopy at multiple magnifications to confirm endothelial cell labelling, AT1 cell identity and separation between endothelium and epithelium. As a control, we also scored the association of unlabelled capillary cells. Notably, we found that in capillaries containing labelled gCap cells, all endothelial cells associated with thin regions (n = 12 scored) were unlabelled, consistent with the conclusion that only aCap cells are associated with thin regions. Some sections were scored by an investigator blinded to the genotype of the sample, and similar results were obtained.
Representative electron micrographs (Fig. 2g–i) were pseudocoloured in Adobe Illustrator (Extended Data Fig. 4a).

Pulse-chase labelling experiments

To determine the stability of the two alveolar capillary cell populations in the adult mouse lung, aCap cells were labelled using the Apln-creER knock-in allele combined with the Rosa26-tdTomato Cre reporter and gCap cells were labelled with the Aplnr-creER bacterial artificial chromosome (BAC) transgenic allele combined with the Rosa26-tdTomato Cre reporter. Co-expression of tdTomato and a marker for the respective capillary populations (Apln or Ednrb for aCap, Aplnr or Ptprb for gCap) was detected after a 1-, 6- or 14-month chase by smFISH. Around 500–2,000 lineage-labelled cells were scored per lung in at least 3 random fields of view taken with a Plan-Apochromat 25× oil objective (Carl Zeiss Microscopy), using Volocity software (Quorum Technologies).

The fidelity of Apln-creER was confirmed by dosing an Apln-creER; Rosa26-tdTomato mouse with 4 mg tamoxifen followed by a 48-h chase. Tamoxifen-dependent Cre recombination was observed only in Ednrb-expressing cells (n = 604 scored cells; 598 cells co-expressed tdTomato and Ednrb, but not Aplnr; 6 cells co-expressed tdTomato, Ednrb and Aplnr). Because the Apln-creER knock-in allele is a loss-of-function allele, we used Ednrb rather than Apln as an aCap marker.

The fidelity of Aplnr-creER was established by dosing an Aplnr-creER; Rosa26-tdTomato mouse with 4 mg tamoxifen followed by a 48-h chase. Tamoxifen-dependent Cre recombination was observed only in Aplnr-expressing cells (n = 1,879 scored cells; 1,860 cells co-expressed tdTomato and Aplnr, but not Apln; 19 cells co-expressed tdTomato, Aplnr and Apln).
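The scored counts reported for the two creER lines can be turned into a labelling specificity. The Python sketch below shows the arithmetic using only the numbers given above; the function name is illustrative, and the percentage is a derived summary that the text itself does not state.

```python
def percent_specific(single_marker_cells, total_scored):
    """Percentage of lineage-labelled cells expressing only the expected marker."""
    return 100.0 * single_marker_cells / total_scored

# Counts as reported in the two paragraphs above.
apln_creer = percent_specific(598, 604)     # tdTomato+ Ednrb+ Aplnr- of 604 scored
aplnr_creer = percent_specific(1860, 1879)  # tdTomato+ Aplnr+ Apln- of 1,879 scored

print(f"Apln-creER:  {apln_creer:.1f}% of labelled cells expressed only the aCap marker")
print(f"Aplnr-creER: {aplnr_creer:.1f}% of labelled cells expressed only the gCap marker")
```

In both lines the remaining labelled cells (6 and 19, respectively) co-expressed the markers of both capillary populations.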

Sparse labelling of endothelial cells and analysis of cell morphology

To visualize single endothelial cells in the lung and other organs, mice carrying inducible creER alleles were administered limiting doses of tamoxifen (see ‘Mice’ for details). Organs were collected as described, fixed in 2% PFA in PBS for 5 h at 4 °C and cut into 200–250 μm sections on a vibratome. To preserve endogenous fluorescence for imaging, tissue was not dehydrated into methanol. For some experiments, lung sections were stained with Alexa Fluor 350 hydrazide (Invitrogen, A10439; 1:100) or Alexa Fluor 633 hydrazide (Invitrogen, A30634; 1:1,000–1:5,000) to visualize elastin fibres30. To visualize the vasculature, tamoxifen-dosed Cdh5-creER; Rosa26-Confetti mice were injected with 0.2 ml DyLight 649-labelled Lycopersicon esculentum (Tomato) lectin (Vector Laboratories, DL-1178; 1 mg/ml) and humanely euthanized after 5 min. Sections were cleared and mounted in CUBIC131 for confocal imaging. Sparse labelling, with individual fluorescent cells well separated from other cells, was verified. Volume and surface area of individual plexus cells (n = 18 cells at E12.5 from n = 2 mice), aCap cells (n = 23 cells at P0; n = 14 (volume and surface area); n = 16 (pores) at P7; n = 19 cells at 4 months from n = 2 mice) and gCap cells (n = 12 cells at P0; n = 11 cells at P7; n = 17 cells at 4 months from n = 2 mice) were automatically calculated from computed three-dimensional (3D) surfaces using Imaris software (Bitplane). Pore number was determined by counting using the original confocal stack viewed in 3D in Imaris.

Clonal analysis

Lungs from P25 Aplnr-creER; Rosa26-Confetti mice (n = 2), the dams of which were administered limiting doses of tamoxifen (0.06 mg administered by i.p. injection) at E14.5, were collected as described in ‘Mice’, fixed in 4% PFA in PBS for 2 h at 4 °C and cut into 300-μm sections on a vibratome. Sections were stained with Alexa Fluor 633 hydrazide to visualize elastin fibres as above. To preserve endogenous fluorescence for imaging, sections were cleared and mounted in CUBIC131. The tamoxifen dose was chosen to yield small numbers of well-separated clones for each fluorescent reporter, such that some sections did not contain any clones. Only YFP- and RFP-expressing clones were analysed, because cell number and type could not be identified in nuclear GFP- and membrane targeted Cerulean CFP-expressing clones.

After confocal imaging, clones were visualized in 3D in Imaris. Cell-type composition was scored on the basis of cell morphology. For some clones, cell-type identity could not be definitively assigned to all cells in the clone, especially if cell boundaries could not be determined. In these cases, the number of cells of a given type that could be identified is given as a lower limit in Extended Data Fig. 6e. (For example, in clone 28, which contains 40 cells, there are at least 4 aCap and 30 gCap cells.)

Detection of capillary cell proliferation by EdU labelling

The synthetic deoxyribonucleoside analogue EdU (Carbosynth, NE08701) was administered to adult Apln-creER; Rosa26-tdTomato and Aplnr-creER; Rosa26-tdTomato mice in drinking water at 0.2 mg/ml 3 weeks after tamoxifen injection (two doses of 4 mg administered by i.p. injection 48 h apart) for 6 weeks. Lungs were collected as described, fixed in 4% PFA in PBS for 5 h at 4 °C and cut into 300-μm sections on a vibratome. EdU was detected using click chemistry to covalently attach Alexa Fluor 647 azide to EdU alkyne incorporated into DNA during the S phase of the cell cycle (Invitrogen, C10340; Click-iT EdU Alexa Fluor 647 Imaging Kit), by incubating vibratome sections in Click-iT reaction cocktail for 4 h at room temperature.

Alveolar injury with elastase

Elastase solution (0.8 μg/μl) was prepared by dissolving elastase (Worthington, LS002292) in PBS and stored at −20 °C. A single dose of elastase (40 μg) or PBS was delivered by intratracheal instillation into the lungs of avertin (1.2% in PBS)-anaesthetized adult Apln-creER; Rosa26-tdTomato and Aplnr-creER; Rosa26-tdTomato mice 3 weeks after tamoxifen injection (two doses of 4 mg administered by i.p. injection 48 h apart) to allow tamoxifen clearance. EdU was administered as described above starting the night before elastase injury for three days, one week or six weeks. Lungs were collected as described, fixed in 4% PFA in PBS for 5 h at 4 °C and cut into 300-μm sections on a vibratome. EdU was detected using click chemistry as described above. Sections were incubated in Alexa Fluor 488 hydrazide (Invitrogen, A10436; reconstituted to 0.5 mg/ml in PBS, used at 1:100) and DAPI (2 μg/ml) in 0.1% Triton-X-100 in PBS overnight at 4 °C. After washing, sections were cleared and imaged in CUBIC131. EdU incorporation was analysed in lineage-labelled cells in Apln-creER; Rosa26-tdTomato lungs (3 days, control: n = 2 mice, 296 and 297 cells counted; 1 week, control: n = 2 mice, 404 and 295 cells counted; 6 weeks, control: n = 2 mice, 424 and 3,977 cells counted; 3 days, elastase: n = 2 mice, 767 and 1,018 cells counted; 1 week, elastase: n = 4 mice, 431, 877, 253 and 223 cells counted; 6 weeks, elastase: n = 3 mice, 1,252, 420 and 1,054 cells counted) and in Aplnr-creER; Rosa26-tdTomato lungs (3 days, control: n = 2 mice, 1,416 and 1,569 cells counted; 1 week, control: n = 2 mice, 1,286 and 1,129 cells counted; 6 weeks, control: n = 2 mice, 1,172 and 3,390 cells counted; 3 days, elastase: n = 3 mice, 1,286, 810 and 1,267 cells counted; 1 week, elastase: n = 3 mice, 844, 623 and 707 cells counted; 6 weeks, elastase: n = 2 mice, 967 and 1,246 cells counted). In elastase-treated lungs, only cells in injured regions were scored.
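The stated stock concentration (0.8 μg/μl) and dose (40 μg) imply an instilled volume, which the text does not give explicitly. The Python sketch below shows that calculation under the assumption that the stock was instilled undiluted; the function name is illustrative.

```python
def instilled_volume_ul(dose_ug, stock_ug_per_ul):
    """Volume of stock (ul) needed to deliver a given dose (ug)."""
    return dose_ug / stock_ug_per_ul

# Values as stated above; the use of undiluted stock is an assumption.
volume = instilled_volume_ul(dose_ug=40, stock_ug_per_ul=0.8)
print(f"Implied instillation volume: {volume:.0f} ul")
```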

Acquisition and processing of images

The image of the immunostained embryonic lung in Fig. 4a was captured using a Zeiss Axioskop microscope and the MRC-1000 Laser Scanning Confocal Imaging System (Bio-Rad) and processed in ImageJ (NIH). All other fluorescent samples were imaged on an LSM 780 (Carl Zeiss Microscopy) or LSM 880 equipped with Airyscan (Carl Zeiss Microscopy) confocal microscope, and images were processed in Zen (Carl Zeiss Microscopy) or Imaris (Bitplane) software. Two-dimensional (2D) confocal images presented are maximum intensity projections of z-stacks. H&E stains were imaged on a slide scanning system (Philips), and images were processed using QuPath software32.

Analysis of scRNA-seq data

Processed scRNA-seq Smart-Seq2 data for adult (3-month-old) mouse lung were obtained from the Tabula Muris resource13 (https://tabula-muris.ds.czbiohub.org) as gene count tables with de-multiplexed and aligned reads. Cells with fewer than 500 detected genes or 50,000 reads were excluded. Data were log-transformed: ln(CPM + 1). Expression profiles of cells were clustered using the R software package Seurat33 (v.2.3). Highly variable genes were selected using the ‘FindVariableGenes’ function (dispersion (variance/mean) z-score > 0.5) for linear dimensionality reduction using principal component analysis. The number of principal components was selected by inspection of the plot of variance explained. Cells were clustered by constructing a shared nearest neighbour graph and clusters were visualized by t-SNE. Lung endothelial cells (n = 693 cells, Fig. 1b, c, Fig. 3, Extended Data Fig. 1a, b, Extended Data Fig. 3h, Extended Data Fig. 4d, e, Extended Data Fig. 10a) were identified by Cldn5, Pecam1 and Cdh5 expression and subclustered. Artery (n = 76 cells), vein (n = 54 cells) and lymphatics (n = 68 cells) clusters were annotated using canonical markers (Gja5 and Bmx for arteries; Nr2f2 for veins; Vwf for arteries and veins; Pdpn and Prox1 for lymphatics). For annotation of the remaining clusters (aCap, n = 101 cells; gCap, n = 394 cells), cluster markers were identified by differential expression analysis using the Wilcoxon rank sum test with Bonferroni correction (P < 0.01) as implemented in the ‘FindMarkers’ function in Seurat, and specific cluster markers were selected for localization of the cells by smFISH (Apln or Ednrb for aCap, Aplnr or Ptprb for gCap). The apparent separation of the cluster we annotated as gCap into two populations (see t-SNE plot in Fig. 1b) is probably due to batch effects, on the basis of the correlation of gene expression differences with technical differences, and was not analysed further.
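The cell-level quality filter and the ln(CPM + 1) transform described above can be illustrated in a few lines. The Python sketch below is not the Seurat code used for the analysis. It assumes each cell is represented as a plain dictionary of gene to read count, and the function names are illustrative.

```python
import math

MIN_GENES = 500     # cells with fewer detected genes were excluded
MIN_READS = 50_000  # cells with fewer reads were excluded

def passes_qc(counts):
    """counts: dict mapping gene name to read count for one cell."""
    detected_genes = sum(1 for c in counts.values() if c > 0)
    total_reads = sum(counts.values())
    return detected_genes >= MIN_GENES and total_reads >= MIN_READS

def log_cpm(counts):
    """Return ln(CPM + 1) per gene, where CPM is counts per million reads."""
    total_reads = sum(counts.values())
    return {gene: math.log(c / total_reads * 1e6 + 1) for gene, c in counts.items()}

# Toy cell (hypothetical): 600 genes with 100 reads each, 60,000 reads in total.
toy_cell = {f"gene{i}": 100 for i in range(600)}
if passes_qc(toy_cell):
    print(round(log_cpm(toy_cell)["gene0"], 3))
```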

Processed scRNA-seq MARS-Seq data for embryonic (E12.5–E19.5) and postnatal (P0, P1, P7 and 2-month-old) mouse lung from a previous study34 were obtained from the Gene Expression Omnibus (GEO) (accession number GSE119228). Cells with fewer than 500 unique molecular identifiers were discarded. Cells from adult lung processed differently than tissue from embryonic and postnatal time points were also eliminated. Data were log-transformed: ln(UP10K + 1) and analysed with Seurat as described above. Cells were initially clustered separately at each developmental stage and endothelial cells were identified by Cldn5, Pecam1 and Cdh5 expression. Annotated endothelial cells were combined from all time points (n = 4,378 cells) and clustered again. Contaminant haematopoietic cells were eliminated by filtering out Ptprc-expressing cells (log-transformed expression levels > 0.5). Artery, vein and lymphatics clusters were identified using canonical markers (Gja5 and Bmx for arteries; Nr2f2 for veins; Vwf for arteries and veins; Pdpn and Prox1 for lymphatics) following correction for cell cycle as described previously35 and eliminated from the analysis, resulting in a dataset of 3,094 plexus and capillary cells (64 cells at E12.5, 404 cells at E16.5, 296 cells at E18.5, 117 cells at E19.5, 1,016 cells at P0, 426 cells at P1, 438 cells at P7 and 333 cells at 2 months (adult); see Extended Data Figs. 6h–j, 7, Supplementary Data 24).
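The removal of contaminant haematopoietic cells can be sketched as a threshold on log-transformed Ptprc expression. The Python illustration below assumes that UP10K denotes unique molecular identifiers per 10,000 and that each cell is a dictionary of gene to UMI count. The function names are illustrative and this is not the Seurat code used for the analysis.

```python
import math

def log_up10k(umi_counts):
    """Return ln(UP10K + 1) per gene for one cell (UMIs scaled to 10,000 per cell)."""
    total_umis = sum(umi_counts.values())
    return {gene: math.log(c / total_umis * 1e4 + 1) for gene, c in umi_counts.items()}

def is_haematopoietic(umi_counts, threshold=0.5):
    """Flag a cell whose log-transformed Ptprc expression exceeds the threshold."""
    return log_up10k(umi_counts).get("Ptprc", 0.0) > threshold

# Toy cells (hypothetical UMI counts, not data from the study).
endothelial_like = {"Cldn5": 700, "Pecam1": 300}
contaminant_like = {"Cldn5": 10, "Ptprc": 990}
print(is_haematopoietic(endothelial_like), is_haematopoietic(contaminant_like))
```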

Single-cell trajectories were constructed for plexus and capillary cells (n = 3,094) using Monocle2 (ref. 36). Mature aCap and gCap markers, identified by differential gene expression analysis in Seurat, were used for ordering of the cells, which produced a tree-shaped branched developmental trajectory (Extended Data Fig. 6h), with plexus cells located along the stem before the branchpoint, mature aCap cells at the tip of one branch and mature gCap cells at the tip of the second branch. Genes with branch-dependent expression were identified by branched expression analysis modelling (BEAM) (n = 1,119 genes; q-value < 0.05), as implemented in the ‘BEAM’ function in Monocle2 (ref. 36). Genes that vary as a function of pseudotime were identified by differential expression analysis (n = 3,734 genes; q-value < 0.05), as implemented in the ‘differentialGeneTest’ function in Monocle2. The genes identified by the two approaches (n = 4,129) were clustered hierarchically and plotted as a branched heat map (Extended Data Fig. 7b, Supplementary Data 4) to visualize groups of genes that co-vary across pseudotime.
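If the 4,129 genes are read as the union of the two gene lists (an interpretation, as the text does not say so explicitly), the number of genes identified by both approaches follows from inclusion-exclusion. A short Python sketch of that check, with an illustrative function name:

```python
def implied_overlap(n_a, n_b, n_union):
    """Size of the intersection of two sets given their sizes and the size of their union."""
    overlap = n_a + n_b - n_union
    if overlap < 0 or overlap > min(n_a, n_b):
        raise ValueError("set sizes are inconsistent with the stated union")
    return overlap

# Gene counts as reported above: BEAM, pseudotime test and combined list.
print(implied_overlap(1119, 3734, 4129))
```

Under that reading, 724 genes would be shared between the branch-dependent and pseudotime-dependent lists.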

Processed scRNA-seq Smart-Seq2 data for adult (3-month-old) mouse heart and brain were obtained from the Tabula Muris resource13 as Seurat objects with annotated clusters. Brain and heart endothelial cells were identified by Cldn5, Pecam1, Esam and Cdh5 expression and subclustered. Artery, vein and lymphatics clusters were identified using canonical markers (Gja5, Bmx and Vegfc for arteries; Vwf for arteries and veins; Il1r1 for veins; Pdpn and Prox1 for lymphatics)37 and excluded from the analysis. The remaining cells were annotated as capillaries based on expression of Car4 (refs. 38, 39).

Processed scRNA-seq droplet (10X) data for adult (1-, 3-, 18-, 21- and 30-month-old) mouse lung, kidney and mammary gland were obtained from the Tabula Muris Senis resource40 (https://tabula-muris-senis.ds.czbiohub.org). Scanpy single-cell objects were imported in R and analysed with Seurat as described above. Cells with fewer than 500 detected genes or 1,000 unique molecular identifiers were discarded. Kidney, mammary gland and lung endothelial cells were identified by Pecam1, Cdh5 and Esam (for kidney and mammary gland) or Cldn5 (for lung) expression and subclustered. Contaminant haematopoietic cells were eliminated by filtering out Ptprc-expressing cells (log-transformed expression levels > 0.5). Contaminant stromal and epithelial cells were eliminated in the mammary gland dataset by filtering out cells that express Col1a1, Pdgfrb or Epcam. Artery, vein and lymphatics clusters were identified using canonical markers as above and excluded. Lung capillary cells were annotated using aerocyte (Apln, Car4, Ednrb, Tbx2) and gCap markers (Aplnr, Gpihbp1, Lpl). Clusters with Ehd3, Sost and Meg3 expression were annotated as glomerular capillaries in the kidney dataset. Clusters with expression of Gpihbp1, Rbp7 and Car4, and little or no expression of Vwf, were annotated as capillaries in the mammary gland dataset.

Alveolar aCap and gCap signature scores (defined as the sum of normalized and scaled gene expression values for the significant alveolar aCap or gCap marker genes: Bonferroni corrected P value < 0.01; average normalized expression fold change > 1; % expression > 40) were calculated for annotated lung (n = 495), heart (n = 753) and brain (n = 441) capillary endothelial cells in the Tabula Muris Smart-Seq2 data13 (Extended Data Fig. 3h), and for annotated lung (n = 2,050), glomerular (n = 126) and mammary gland (n = 173) capillary endothelial cells in the Tabula Muris Senis droplet data40 (Extended Data Fig. 3i).
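A signature score of this kind can be sketched as a per-cell sum of scaled marker expression. The Python illustration below assumes that "scaled" means a per-gene z-score across cells and that the marker list has already been selected by the stated thresholds. The function names and the toy matrix are hypothetical, and this is not the code used for the analysis.

```python
from statistics import mean, stdev

def zscore(values):
    """Scale one gene across cells to mean 0 and s.d. 1 (all zeros if constant)."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]

def signature_scores(expr, marker_genes):
    """expr: dict mapping gene to a list of normalized values, one per cell.

    Returns one score per cell: the sum of scaled values over the marker genes.
    """
    n_cells = len(next(iter(expr.values())))
    scores = [0.0] * n_cells
    for gene in marker_genes:
        for i, z in enumerate(zscore(expr[gene])):
            scores[i] += z
    return scores

# Toy expression matrix for three cells (hypothetical values).
toy_expr = {"markerA": [0.0, 1.0, 2.0], "markerB": [0.0, 2.0, 4.0]}
print(signature_scores(toy_expr, ["markerA", "markerB"]))
```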

Smart-Seq2 and droplet (10X) data for adult human lung (patient 1, 75-year-old man)16 were analysed as described above. Endothelial cells were identified by CLDN5 expression, subclustered and annotated using markers (GJA5 for arteries, ACKR1 for vein, PDPN for lymphatics, EDNRB for aerocytes, EDN1 for gCap cells, COL15A1 for bronchial vessels; droplet: n = 1,497 cells; 211 artery cells, 154 vein cells, 33 lymphatic cells, 230 bronchial endothelial cells, 315 aerocytes, 554 gCap cells; Fig. 4k, m–o, Extended Data Figs. 8b, c, 10; Smart-Seq2, n = 599 cells, Extended Data Fig. 10).

To identify differences in cell-type-specific expression between mouse and human, we first identified genes that are differentially expressed (P < 0.01, Wilcoxon rank sum test with Bonferroni correction) between the alveolar capillary cell types in each species using scRNA-seq Smart-Seq2 data for adult mouse lung endothelial cells obtained from the Tabula Muris resource13 and Smart-Seq2 data for adult human lung endothelial cells (patient 1, 75-year-old man)16. Lists of differentially expressed genes (see Supplementary Tables 2, 3) were then compared to identify genes specifically expressed in the same cell type in the two species (type 0), genes specifically expressed in one cell type in one species but not the other (type 1), or genes specifically expressed in different cell types in the two species (type 2). Selected genes of each type are shown in Fig. 4m–o, Extended Data Fig. 10c–e. A complete analysis of the evolutionary changes between alveolar capillary cell types in mouse versus human lungs is presented in supplementary table 7 in ref. 16.
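The three comparison categories can be expressed as a lookup over the two species' lists of differentially expressed genes. The Python sketch below assumes that gene names have already been mapped to a common orthologue identifier and that each list records the cell type in which a gene is enriched. The function name and the placeholder genes are hypothetical and do not represent results from the study.

```python
def classify_gene(gene, mouse_specific, human_specific):
    """Return 0, 1 or 2 for the comparison type, or None if the gene is in neither list.

    mouse_specific / human_specific: dict mapping gene to the enriched cell type
    ("aCap" or "gCap") in that species.
    """
    in_mouse = mouse_specific.get(gene)
    in_human = human_specific.get(gene)
    if in_mouse is None and in_human is None:
        return None
    if in_mouse is None or in_human is None:
        return 1  # cell-type-specific in one species but not the other
    return 0 if in_mouse == in_human else 2  # same cell type, or switched cell type

# Placeholder gene lists (hypothetical).
mouse = {"GeneA": "aCap", "GeneB": "gCap", "GeneC": "aCap"}
human = {"GeneA": "aCap", "GeneC": "gCap", "GeneD": "gCap"}
for gene in sorted(set(mouse) | set(human)):
    print(gene, classify_gene(gene, mouse, human))
```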

Statistics and reproducibility

Data analysis and statistical tests were performed using R software (v.3.5.1). Data are represented as mean ± standard deviation (s.d.) for sample sizes larger than two. For comparison of two groups, a two-sided Wilcoxon rank sum test was conducted at the 5% significance level. For Fig. 1a, the image is representative of n = 10 mice; for Fig. 1d, smFISH was performed on samples from n = 5 mice; for Fig. 1e, smFISH was performed on samples from n = 2 mice; for Fig. 1f, smFISH was performed on samples from n = 5 mice; for Fig. 1h, smFISH was performed on samples from two lobes of n = 1 mouse; for Fig. 1i, smFISH was performed on samples from two lobes of n = 1 mouse; for Fig. 2a, the image is representative of n = 5 mice; for Fig. 2b, the image is representative of n = 5 mice; for Fig. 2d, the image is representative of n = 2 mice; for Fig. 2f–i, each image is representative of n = 2 mice; for Fig. 2l, images are representative of n = 3 mice; for Fig. 2m, images are representative of n = 2 mice; for Fig. 2o, smFISH was performed on samples from n = 2 mice; for Fig. 4a, the image is representative of n = 10 embryos; for Fig. 4b, the image is representative of n = 2 embryos; for Fig. 4c, images are representative of n = 2 mice; for Fig. 4d, smFISH was performed on samples from n = 2 mice; for Fig. 4e, clones were examined in n = 2 mice, see clone table in Extended Data Fig. 6e; for Fig. 4g, images are representative of n = 2 mice at each time point; for Fig. 4j, smFISH was performed on samples from n = 3 humans; for Fig. 4l, smFISH was performed on samples from a single adenocarcinoma; for Fig. 4q, the image is representative of multiple lung regions from a single alligator; for Fig. 4r, smFISH was performed on samples from n = 2 alligators (1 juvenile; 1 adult); for Extended Data Fig. 1c, smFISH was performed on samples from n = 5 mice; for Extended Data Fig. 1d, smFISH was performed on samples from n = 2 mice; for Extended Data Fig. 1e, e’, smFISH was performed on samples from n = 2 mice; for Extended Data Fig. 1f, g, smFISH was performed on samples from n = 1 mouse; for Extended Data Fig. 1j, smFISH was performed on samples from n = 2 mice at each age; for Extended Data Fig. 1k, l, smFISH was performed on samples from two lobes of n = 1 mouse of each genotype; for Extended Data Fig. 2a–d, images are representative of n = 2 embryos or mouse pups at each time point; for Extended Data Fig. 2e, f, images are representative of n = 5 mice; for Extended Data Fig. 3a, the image is representative of n = 2 mice; for Extended Data Fig. 3b–f, images are representative of n = 4 mice; for Extended Data Fig. 3g, the image is representative of n = 2 mice; for Extended Data Fig. 3j–l, smFISH was performed on samples from n = 2 mice; for Extended Data Fig. 4a–c, images are representative of n = 2 mice of each genotype; for Extended Data Fig. 5a, the images are representative of n = 2 mice at each time point; for Extended Data Fig. 5c, the images are representative of n = 2 mice of each genotype; for Extended Data Fig. 5d, the images are representative of n = 2 Apln-creER; Rosa26-tdTomato mice and n = 3 Aplnr-creER; Rosa26-tdTomato mice; for Extended Data Fig. 5e, the images are representative of n = 4 Apln-creER; Rosa26-tdTomato mice and n = 3 Aplnr-creER; Rosa26-tdTomato mice; for Extended Data Fig. 5f, the images are representative of n = 3 Apln-creER; Rosa26-tdTomato mice and n = 2 Aplnr-creER; Rosa26-tdTomato mice; for Extended Data Fig. 6a, b, smFISH was performed on samples from n = 2 embryos; for Extended Data Fig. 6c, smFISH was performed on samples from two lobes of n = 1 mouse; for Extended Data Fig. 6d, e, clones were examined in n = 2 mice; for Extended Data Fig. 6f, f’, f’’, smFISH was performed on samples from two lobes of n = 1 mouse; for Extended Data Fig. 6g, smFISH was performed on samples from n = 3 embryos; for Extended Data Fig. 6k, l, smFISH was performed on samples from n = 3 mice at each age; for Extended Data Fig. 8a, the images are representative of n = 5 human lungs; for Extended Data Fig. 8d–h, smFISH was performed on samples from n = 3 humans; for Extended Data Fig. 8j, j’, histology and smFISH were performed on samples from a single human fetal lung from each time point (17 weeks and 23 weeks; see figure legend); for Extended Data Fig. 9a, a’, a’’, immunostaining and smFISH were performed on samples from a single adenocarcinoma; for Extended Data Fig. 9b, b’, c, c’, f, smFISH was performed on samples from n = 2 mice; for Extended Data Fig. 11c, d, images are representative of multiple regions from a single alligator lung; for Extended Data Fig. 11e, f, smFISH was performed on samples from n = 2 alligators (1 juvenile, 1 adult); for Extended Data Fig. 11g, the image is representative of multiple regions from a single alligator lung; for Extended Data Fig. 11i, the image is representative of multiple regions from a single turtle lung; for Extended Data Fig. 11j, k, images are representative of n = 2 turtles; for Extended Data Fig. 11l, l’, smFISH was performed on samples from n = 2 turtles; for Extended Data Fig. 11m, the image is representative of n = 2 turtles; for Supplementary Data 1, images are representative of n = 2 mice. For all graphs, the number of biologically independent samples is reported in the legend, or in the ‘Alveolar injury with elastase’ (for Fig. 2n) or ‘Sparse labelling of endothelial cells and analysis of cell morphology’ (for Fig. 2c, 4h, Extended Data Fig. 2g, h) sections of the Methods. Sample size calculations were not performed. Mice of the appropriate genotype or age were allocated into experimental groups (control versus elastase injury) at random. The investigators were not blinded to sample allocation.
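The reporting rule (mean ± s.d. only for sample sizes larger than two) and the two-sided rank sum comparison can be illustrated with a small, self-contained Python sketch. The analyses themselves were run in R, and the source does not state which R function was used. The exact permutation form shown here, the assumption of no tied values, and the function names are all illustrative.

```python
from itertools import combinations
from statistics import mean, stdev

def summarize(values):
    """Report mean ± s.d. for n > 2, and the mean alone otherwise."""
    if len(values) > 2:
        return f"{mean(values):.1f} ± {stdev(values):.1f}"
    return f"{mean(values):.1f}"

def rank_sum_exact_p(x, y):
    """Exact two-sided Wilcoxon rank sum P value (assumes no tied values)."""
    pooled = sorted(x + y)
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    n, total = len(x), len(x) + len(y)
    observed = sum(rank[v] for v in x)
    expected = n * (total + 1) / 2  # mean rank sum of x under the null hypothesis
    # Enumerate every possible assignment of n ranks to group x.
    all_sums = [sum(c) for c in combinations(range(1, total + 1), n)]
    extreme = sum(1 for s in all_sums if abs(s - expected) >= abs(observed - expected))
    return extreme / len(all_sums)

# Toy values (hypothetical, not data from the study).
print(summarize([1.0, 2.0, 3.0]), "|", summarize([1.0, 3.0]))
print(rank_sum_exact_p([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```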

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.