Probabilistic topic modeling algorithms like Latent Dirichlet Allocation (LDA) have become powerful tools for the analysis of large collections of documents (such as papers, projects, or funding applications) in science, technology and innovation (STI) policy design and monitoring. However, selecting an appropriate and stable topic model for a specific application (by adjusting the hyperparameters of the algorithm) is not a trivial problem. Common validation metrics like coherence or perplexity, which focus on the quality of topics, are not a good fit in applications where the quality of the document similarity relations inferred from the topic model is especially relevant. Relying on graph analysis techniques, the aim of our work is to propose a new methodology for the selection of hyperparameters that is specifically oriented to optimizing the similarity metrics derived from the topic model. To do this, we propose two graph metrics: the first measures the variability of the similarity graphs that result from different runs of the algorithm for a fixed value of the hyperparameters, while the second measures the alignment between the graph derived from the LDA model and another obtained using metadata available for the corresponding corpus. Through experiments on various corpora related to STI, it is shown that the proposed metrics provide relevant indicators to select the number of topics and build persistent topic models that are consistent with the metadata. Their use, which can be extended to other topic models beyond LDA, could facilitate the systematic adoption of these techniques in STI policy analysis and design.
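The abstract does not give formulas for the two graph metrics, so the following is only a minimal sketch of what such metrics could look like, not the authors' definitions. It assumes cosine similarity between document-topic vectors, Frobenius distance between graphs for the variability across runs, and Pearson correlation of edge weights for the alignment with a metadata graph. The function names are hypothetical.

```python
import numpy as np


def similarity_graph(doc_topics):
    """Weighted adjacency matrix built from document-topic rows (cosine similarity)."""
    x = np.asarray(doc_topics, dtype=float)
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)  # ignore self-similarity
    return sim


def graph_variability(graphs):
    """Mean pairwise Frobenius distance between graphs from repeated runs.

    Assumed stand-in for the variability metric; needs at least two graphs.
    """
    dists = [
        np.linalg.norm(a - b)
        for i, a in enumerate(graphs)
        for b in graphs[i + 1:]
    ]
    return float(np.mean(dists))


def graph_alignment(model_graph, metadata_graph):
    """Pearson correlation between the upper-triangular edge weights of two graphs.

    Assumed stand-in for the alignment metric.
    """
    iu = np.triu_indices_from(model_graph, k=1)
    return float(np.corrcoef(model_graph[iu], metadata_graph[iu])[0, 1])
```

Under these assumptions, one would train the model several times for each candidate number of topics, then prefer settings with low `graph_variability` across runs and high `graph_alignment` with the metadata graph.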
Number entry is a ubiquitous activity and is often performed in safety- and mission-critical procedures, such as healthcare, science, finance, aviation and in many other areas. We show that Monte Carlo methods can quickly and easily compare the reliability of different number entry systems. A surprising finding is that many common, widely used systems are defective, and induce unnecessary human error. We show that Monte Carlo methods enable designers to explore the implications of normal and unexpected operator behaviour, and to design systems to be more resilient to use error. We demonstrate novel designs with improved resilience, implying that the common problems identified and the errors they induce are avoidable.
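As an illustration of the Monte Carlo comparison described above, the sketch below simulates random key slips and counts how often each design accepts a wrong value without flagging it. The error model (uniform random key substitution) and both parsers are hypothetical and are not taken from the paper.

```python
import random

KEYS = "0123456789."


def noisy_keystrokes(intended, slip_rate, rng):
    """Replace each intended key with a random key with probability slip_rate."""
    return "".join(
        rng.choice(KEYS) if rng.random() < slip_rate else k for k in intended
    )


def lenient_parse(keys):
    """Hypothetical design: silently drops any decimal point after the first."""
    head, _, tail = keys.partition(".")
    return float((head or "0") + "." + (tail.replace(".", "") or "0"))


def strict_parse(keys):
    """Hypothetical design: rejects malformed input (returns None) instead of guessing."""
    if keys.count(".") > 1 or keys.startswith(".") or keys.endswith("."):
        return None
    return float(keys)


def undetected_error_rate(parse, intended, slip_rate=0.05, trials=10000, seed=0):
    """Fraction of simulated entries where a wrong value is accepted silently."""
    rng = random.Random(seed)
    target = float(intended)
    bad = 0
    for _ in range(trials):
        value = parse(noisy_keystrokes(intended, slip_rate, rng))
        if value is not None and value != target:
            bad += 1
    return bad / trials
```

Calling `undetected_error_rate(lenient_parse, "12.5")` and `undetected_error_rate(strict_parse, "12.5")` with the same seed runs both designs on the same simulated keystrokes, so the two rates can be compared directly.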
The diagnosis and treatment of prostate cancer (PCa) is a major health-care concern worldwide. This cancer can manifest itself in many distinct forms and the transition from clinically indolent PCa to the more invasive aggressive form remains poorly understood. It is now universally accepted that glycan expression patterns change with the cellular modifications that accompany the onset of tumorigenesis. The aim of this study was to investigate whether differential glycosylation patterns could distinguish between indolent, significant, and aggressive PCa. Whole serum N-glycan profiling was carried out on serum from 117 prostate cancer patients using our automated, high-throughput analysis platform for glycan profiling, which utilizes ultra-performance liquid chromatography (UPLC) to obtain high-resolution separation of N-linked glycans released from the serum glycoproteins. We observed increases in hybrid, oligomannose, and biantennary digalactosylated monosialylated glycans (M5A1G1S1, M8, and A2G2S1), bisecting glycans (A2B, A2(6)BG1) and monoantennary glycans (A1), and decreases in triantennary trigalactosylated trisialylated glycans with and without core fucose (A3G3S3 and FA3G3S3) with PCa progression from indolent through significant and aggressive disease. These changes give us an insight into the disease pathogenesis and identify potential biomarkers for monitoring PCa progression; however, these need to be confirmed in further studies.
Journal of Mechanical Science and Technology - This study delivers equations useful for low-height pleated fibrous filter design: two pressure drop equations and one set of optimum design equations...
Combined photochemical arylation, “nuisance effect” (SNAr) reaction sequences have been employed in the design of small arrays for immediate deployment in medium-throughput X-ray protein–ligand structure determination. Reactions were deliberately allowed to run “out of control” in terms of selectivity; for example the ortho-arylation of 2-phenylpyridine gave five products resulting from mono- and bisarylations combined with SNAr processes. As a result, a number of crystallographic hits against NUDT7, a key peroxisomal CoA ester hydrolase, have been identified.
To quantify the evacuation process, evacuation practitioners use engineering egress data describing the occupant movement characteristics. These data are typically based on young and fit populations. However, the movement abilities of occupants who might be involved in evacuations are becoming more variable, with building populations today typically including increasing numbers of individuals with impairments or who are elderly or generally less mobile. Thus, there will be an increasing proportion of building occupants with reduced ability to egress. For safe evacuation, there is therefore a need to provide valid engineering egress data considering pedestrians with disabilities. Gwynne and Boyce recently compiled a series of data sets related to the evacuation process to support practitioner activities in the chapter Engineering Data in the SFPE Handbook of Fire Protection Engineering. This paper supplements these data sets by presenting data from additional research on the premovement and horizontal movement of participants with physical-, cognitive-, or age-related disabilities. The aim is to provide an overview of currently available data sets related to, and key factors affecting the egress performance of, mixed ability populations which could be used to guide fire safety engineering decisions in the context of building design.