Similar literature
1.
This research adopts a framework that synthesizes Knowledge Discovery in Databases (KDD), the Cross Industry Standard Process for Data Mining (CRISP-DM), and agile practices. The application of this framework is demonstrated through an institutional case study of three knowledge discovery projects: the Persistence, Retention, and Donor projects. Results from the case study suggest that (a) interaction and iteration are foundations for the success of a knowledge discovery project, especially one with a strong business focus; (b) agile practices facilitate the interactive and iterative nature of a knowledge discovery project; (c) explicitly adding the business understanding and deployment steps from CRISP-DM to KDD helps data miners stay focused on the ultimate goals of the project: the needs of the business and the users.

2.
3.
The number, variety and complexity of projects involving data mining or knowledge discovery in databases have recently increased at such a pace that aspects of their development process need to be standardized so that results can be integrated, reused and interchanged in the future. Data mining projects are quickly becoming engineering projects, and current standard processes, like CRISP-DM, need to be revisited to incorporate this engineering viewpoint. This is the central motivation of this paper, which argues that experience gained about the software development process over almost 40 years could be reused and integrated to improve data mining processes. Consequently, this paper proposes to reuse ideas and concepts underlying the IEEE Std 1074 and ISO 12207 software engineering model processes to redefine and add to the CRISP-DM process and make it a data mining engineering standard.

4.
The purpose of the current paper is to develop a theoretical model that identifies why people blog personal content and explains the effects of blogging in “real life.” Data from an online survey are analyzed using maximum likelihood procedures in LISREL 8.75 to test the structural model. Among 531 respondents from Cyworld, a popular social network and blogging site in South Korea, a randomly selected group of 251 users was used to develop the model. The other group of 280 users was used to confirm the usefulness of the revised model. Results (N = 251; N = 280) showed that impression management and voyeuristic surveillance are two major psychological factors that motivate individuals to post and read messages on personal blogs. Results also showed evidence for blogging’s real life consequences, measured by users’ perceived social support, loneliness, belonging, and subjective well-being.

5.
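To address the skewed ratio of churners to non-churners in customer churn prediction models for the telecommunications industry, the model follows data mining principles, uses the CRISP-DM (Cross-Industry Standard Process for Data Mining) modeling process as its framework, and adopts the idea of joint decision-making by multiple base decision trees. The model avoids the problem of training an "empty" decision tree that predicts every customer as a non-churner. Compared with a single classifier, it improves the precision and generalization ability of the prediction model.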

6.
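A time-series forecasting Web service was designed and implemented based on the CRISP-DM (Cross-Industry Standard Process for Data Mining) model to forecast download demand for website resources. The paper focuses on the design ideas and key implementation techniques involved in applying the CRISP-DM model to time-series forecasting tasks. Test results show that the Web service achieves high forecasting accuracy, is quick to deploy and easy to use, and offers a useful demonstration and reference for solving similar problems.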

7.
Group coordination is a crucial component for successful collaborative learning, but is hard to achieve in an online learning environment. A web-based group coordination tool was developed based on metacognitive scaffolding principles for the study. The tool was implemented in an online course for a group project and its effects were investigated. A total of 59 students formed into 20 groups participated in and completed a project while being guided with the tool. Based on response rate to metacognitive prompts of the tool, groups were categorized as Active Metacognitive Team (AMT, n = 30) or Passive Metacognitive Team (PMT, n = 29). AMT showed higher positive interdependence than PMT at the end of the project. AMT perceived reciprocal help among group members while PMT did not. AMT also evaluated its group process higher than PMT did. These results show that groups who more actively used the coordination support tool established positive interdependence, engaged in positive interactions, and had enhanced group productivity.

8.
Lu Zhao, Chen Shiping. Journal of Computer Applications (《计算机应用》), 2011, 31(11): 3087-3090
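To address non-standardized quality management and low decision-making efficiency in the machinery manufacturing industry, this work takes a typical machinery manufacturing enterprise as its starting point and mines its quality management information using the ID3 decision tree algorithm within the Cross-Industry Standard Process for Data Mining (CRISP-DM). A decision tree model was generated using a classification technique based on the information gain ratio, and the model was given a preliminary implementation in an enterprise resource planning (ERP) system. Testing and analysis show that the model can effectively improve the efficiency of management decisions and standardize processing workflows.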
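The split criterion mentioned in this abstract, the information gain ratio, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record fields (`machine`, `shift`, `defect`) and the function names are hypothetical, chosen only to show how a decision tree learner would rank candidate attributes.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Information gain ratio of splitting `rows` (a list of dicts) on `attribute`."""
    base = entropy([r[target] for r in rows])
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    total = len(rows)
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = base - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy([r[attribute] for r in rows])
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical quality-inspection records (illustrative data only).
records = [
    {"machine": "A", "shift": "day", "defect": "no"},
    {"machine": "A", "shift": "night", "defect": "no"},
    {"machine": "B", "shift": "day", "defect": "yes"},
    {"machine": "B", "shift": "night", "defect": "yes"},
]
best = max(["machine", "shift"], key=lambda a: gain_ratio(records, a, "defect"))
```

A tree builder would place the attribute stored in `best` at the current node and then recurse on each subset of records.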

9.
This paper explores teacher beliefs that influence the ways Information and Communications Technologies (ICT) are used in learning contexts. Much has been written about the impact of teachers’ beliefs and attitudes to ICT as ‘barriers’ to ICT integration (Ertmer et al., 2007; Higgins and Moseley, 2001; Loveless, 2003). This paper takes a closer look at the types of beliefs that influence ICT practices in classrooms and the alignment of these beliefs to current pedagogical reform in Australia. The paper draws on data collected through the initial phase of a research project that involved an Industry Collaborative of four Catholic primary schools (prep to grade 7). Data are drawn from teacher surveys, interviews and document analysis. The results present specific links between ICT beliefs that are informing teachers’ practices. ICT beliefs and practices are aligned to the reform agenda for digital pedagogies. The findings of this research inform teacher ICT practice and requirements for ICT professional development.

10.
Dynamic visual acuity (DVA) thresholds are among the few visual functions predictive of automobile crashes. DVA is also sensitive to alcohol and aging. However, measuring DVA is awkward because there is no standardized, efficient, flexible apparatus for DVA assessment. In this project, we developed a prototype of an automated, portable DVA system using a low-energy laser, and we compared this laser DVA with the traditional device in two within-subjects, repeated measures designs. The two studies included 48 participants (22 males and 26 females with an average age of 18.33 years). The most important findings were that: (1) retest reliabilities of the two DVA devices were comparable and higher with the laser; (2) average correlations between the two devices were r = 0.62 (p < 0.01) and r = 0.65 (p < 0.01) for the two designs respectively; and (3) after correction for reliability attenuation these improved to r = 0.92 and r = 0.78. These findings indicate that a flexible DVA laser device can be developed to measure the same construct as the more traditional bulky DVA device.
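The "correction for reliability attenuation" in finding (3) is commonly computed with Spearman's formula, in which the observed correlation is divided by the square root of the product of the two reliabilities. The abstract does not state which formula the authors used or what the retest reliabilities were, so the Python sketch below assumes Spearman's correction and uses made-up input values.

```python
import math


def disattenuate(r_xy, rel_x, rel_y):
    """Spearman's correction for attenuation.

    r_xy  : observed correlation between the two measures
    rel_x : reliability of the first measure (e.g. test-retest)
    rel_y : reliability of the second measure
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical inputs, not taken from the study.
corrected = disattenuate(0.48, 0.8, 0.8)
```

The less reliable the two instruments are, the larger the upward correction, which is why a moderate observed correlation can correspond to a high corrected one.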

11.
Effects of atmospheric variation on AVHRR NDVI data   (Total citations: 1; self-citations: 0; citations by others: 1)
The AVHRR (Advanced Very High Resolution Radiometer) series of instruments has frequently been used for vegetation studies. The 25+ year record has enabled important time-series studies. Many applications use NDVI (Normalized Difference Vegetation Index), or derivatives of it, as their operational variable. However, most AVHRR datasets have incomplete atmospheric correction, because of which there is considerable, but largely unknown, uncertainty in the significance of differences in NDVI and other short wave observations from AVHRR instruments. The purpose of this study was to gain better understanding of the impact of incomplete or lack of atmospheric correction in widely-used, publicly available processed AVHRR-NDVI long-term datasets. This was accomplished by comparison with atmospherically corrected AVHRR data at AERONET (AErosol RObotic NETwork) sunphotometer sites in 1999. The datasets included in this study are: TOA (Top Of Atmosphere), that is, with no atmospheric correction; PAL (Pathfinder AVHRR Land); and an early version of the new LTDR (Long Term Data Record) NDVI. The other publicly available datasets like GIMMS (Global Inventory Modeling and Mapping Studies) and GVI (Global Vegetation Index) have an atmospheric error budget similar to that of TOA, because no atmospheric correction is used in either processing stream. Of the three datasets, LTDR was found to have the smallest errors (accuracy = 0.0064 to −0.024, precision = 0.02 to 0.037 for clear and average atmospheric conditions) followed by PAL (accuracy = −0.145 to −0.035, precision = 0.0606 to 0.0418), and TOA (accuracy = −0.0791 to −0.112, precision = 0.0613 to 0.0684). It was also observed that the temporal maximum value compositing technique does not cause significant improvement of precision in regions experiencing persistently high AOT (Aerosol Optical Thickness).
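For readers unfamiliar with the two operations named in this abstract, the following Python sketch shows the standard NDVI definition, (NIR − red) / (NIR + red), and a per-pixel temporal maximum value composite. The function names and reflectance values are illustrative assumptions and are not taken from any of the datasets discussed.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denom = nir + red
    if denom == 0:
        return float("nan")  # undefined when both reflectances are zero
    return (nir - red) / denom


def max_value_composite(ndvi_series):
    """Per-pixel maximum NDVI over a compositing period.

    ndvi_series is a list of acquisitions; each acquisition is a list of
    NDVI values, one per pixel, in the same pixel order.
    """
    return [max(values) for values in zip(*ndvi_series)]


# Two hypothetical acquisitions over the same two pixels.
day1 = [ndvi(0.5, 0.1), ndvi(0.30, 0.20)]
day2 = [ndvi(0.4, 0.2), ndvi(0.45, 0.15)]
composite = max_value_composite([day1, day2])
```

Compositing keeps the highest value on the assumption that clouds and aerosols usually depress NDVI; where aerosol loading is high on every date, every candidate value is depressed, which fits the abstract's remark about regions with persistently high AOT.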

12.
We conducted a preliminary investigation of the response of ERS C-band SAR backscatter to variations in soil moisture and surface inundation in wetlands of interior Alaska. Data were collected from 5 wetlands over a three-week period in 2007. Results showed a positive correlation between backscatter and soil moisture in sites dominated by herbaceous vegetation cover (r = 0.74, p < 0.04). ERS SAR backscatter was negatively correlated to water depth in all open (non-forested) wetlands when water table levels were more than 6 cm above the wetland surface (r = −0.82, p < 0.001). There was no relationship between backscatter and soil moisture in the forested (black spruce-dominated) wetland site. Our preliminary results show that ERS SAR data can be used to monitor variations in hydrologic conditions in high northern latitude wetlands (including peatlands), particularly sites with sparse tree cover.

13.
To reduce environmental pollution from cropping activities, a reliable indicator of crop N status is needed for site-specific N management in agricultural fields. Nitrogen Nutrition Index (NNI) can be a valuable candidate, but its measurement relies on tedious sampling and laboratory analysis. This study proposes a new spectral index to estimate plant nitrogen (N) concentration, which is a critical component of NNI calculation. Hyperspectral reflectance data, covering bands from 325 to 1075 nm, were collected using a ground-based spectroradiometer on corn and wheat crops at different growth stages from 2005 to 2008. Data from 2006 to 2008 were used for new index development and the comparison of the new index with some existing indices. Data from 2005 were used to validate the best index for predicting plant N concentration. Additionally, a hyperspectral image of a corn field in 2005 was acquired using an airborne Compact Airborne Spectrographic Imager (CASI), and the corresponding plant N concentration was obtained by conventional laboratory methods on selected areas. These data were also used for validation. A new N index, named Double-peak Canopy Nitrogen Index (DCNI), was developed and compared to the existing indices that were used for N detection. In this study, DCNI was the best spectral index for predicting plant N concentration, with R² values of 0.72 for corn, 0.44 for wheat, and 0.64 for both species combined, respectively. The validation using an independent ground-based spectral database of corn acquired in 2005 yielded an R² value of 0.62 and a root-mean-square error (RMSE) of 2.7 mg N g⁻¹ d.m. In the validation using the CASI spectral information, the calculated DCNI was related to actual corn N concentration with an R² value of 0.51 and an RMSE of 3.1 mg N g⁻¹ d.m. It is concluded that DCNI, in association with indices related to biomass, has a good potential for remote assessment of NNI.

14.
Land-cover mapping efforts within the USGS Gap Analysis Program have traditionally been state-centered, with each state having the responsibility of implementing a project design for the geographic area within its state boundaries. The Southwest Regional Gap Analysis Project (SWReGAP) was the first formal GAP project designed at a regional, multi-state scale. The project area comprises the southwestern states of Arizona, Colorado, Nevada, New Mexico, and Utah. The land-cover map/dataset was generated using regionally consistent geospatial data (Landsat ETM+ imagery (1999-2001) and DEM derivatives), similar field data collection protocols, a standardized land-cover legend, and a common modeling approach (decision tree classifier). Partitioning of mapping responsibilities amongst the five collaborating states was organized around ecoregion-based “mapping zones”. Over the course of 2½ field seasons approximately 93,000 reference samples were collected directly, or obtained from other contemporary projects, for the land-cover modeling effort. The final map was made public in 2004 and contains 125 land-cover classes. An internal validation of 85 of the classes, representing 91% of the land area, was performed. Agreement between withheld samples and the validated dataset was 61% (KHAT = 0.60, n = 17,030). This paper presents an overview of the methodologies used to create the regional land-cover dataset and highlights issues associated with large-area mapping within a coordinated, multi-institutional management framework.
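The KHAT value reported here is the sample estimate of Cohen's kappa, which discounts the agreement expected by chance from the raw agreement. The Python sketch below computes it from a confusion matrix; the two-class matrix is a made-up illustration and is unrelated to the SWReGAP validation data.

```python
def cohen_kappa(matrix):
    """Cohen's kappa from a square confusion matrix (rows: map, columns: reference)."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical 2-class confusion matrix with 100 samples.
example = [[40, 10],
           [10, 40]]
kappa = cohen_kappa(example)
```

When samples are spread over many classes, the chance-agreement term tends to be small, which would be consistent with the reported KHAT (0.60) lying close to the raw agreement (61%).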

15.
In this work we develop some reflections on the thresholding algorithm proposed by Tizhoosh in [16]. The purpose of these reflections is to complete the considerations published recently in [17] and [18] on said algorithm. We also prove that under certain constructions, Tizhoosh's algorithm makes it possible to obtain additional information from commonly used fuzzy algorithms.

16.
The present experiment investigated if anthropomorphic interfaces facilitate people’s tendency to project social expectations onto computers and how such effects might vary depending on users’ cognitive style. In a 2 (synthetic vs. recorded speech) × 2 (flattering vs. generic feedback) × 2 (low vs. high rationality) × 2 (low vs. high experientiality) experiment, participants played a trivia game with a computer. Use of recorded speech did not amplify the previously documented flattery effects (Fogg & Nass, 1997), challenging the notion that anthropomorphism will promote social responses to computers. Participants evaluated the human-voiced computer more positively and conformed more to its suggestions than the one using synthetic speech, but such effects were found only among less analytical or more intuition-driven individuals, suggesting dispositional differences in people’s susceptibility to anthropomorphic cues embedded in the interface.

17.
Microdevices dedicated to monitoring metabolite levels have recently enabled many applications in the field of cell analysis, to monitor cell growth and development of numerous cell lines. By combining the traditional technology used for electrochemical biosensors with nanoscale materials, it is possible to develop miniaturized metabolite biosensors with unique properties of sensitivity and detection limit. In particular, enzymes tend to adsorb onto carbon nanotubes and their optical or electrical activity can perturb the electronic properties. In the present work we propose multi-walled carbon nanotube-based biosensors to monitor a cell line highly sensitive to metabolic alterations, in order to evaluate lactate production and glucose uptake during different cell states. We achieve sensors for both lactate and glucose, with sensitivities of 40.1 μA mM⁻¹ cm⁻² and 27.7 μA mM⁻¹ cm⁻², and detection limits of 28 μM and 73 μM, respectively. This nano-biosensing technology is used to provide new information on cell line metabolism during proliferation and differentiation, which is unprecedented in cell biology.

18.
Sudden Oak Death is a new and virulent disease affecting hardwood forests in coastal California. The spatial-temporal dynamics of oak mortality at the landscape scale are crucial indicators of disease progression. Modeling disease spread requires accurate mapping of the dynamic pattern of oak mortality in time through multi-temporal image analysis. Traditional mapping approaches using per-pixel, single-date image classifications have not generated consistently satisfactory results. Incorporation of spatial-temporal contextual information can improve these results. In this paper, we propose a spatial-temporally explicit algorithm to classify individual images using the spectral and spatial-temporal information derived from multiple co-registered images. This algorithm is initialized by a spectral classification using Support Vector Machines (SVM) for each individual image. Then, a Markov Random Fields (MRF) model accounting for ecological compatibility is used to model the spatial-temporal contextual prior probabilities of images. Finally, an iterative algorithm, Iterative Conditional Mode (ICM), is used to update the classification based on the combination of the initial SVM spectral classifications and MRF spatial-temporal contextual model. The algorithm was applied to two-year (2000, 2001) ADAR (Airborne Data Acquisition and Registration) images, from which three classes (bare, dead, forest) are detected. The results showed that the proposed algorithm achieved significantly better results (Year 2000: Kappa = 0.92; Year 2001: Kappa = 0.91), compared to traditional pixel-based single-date approaches (Year 2000: Kappa = 0.67; Year 2001: Kappa = 0.66). The improvement from the contributions of spatial-temporal contextual information indicated the importance of spatial-temporal modeling in multi-temporal remote sensing in general and forest disease modeling in particular.
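The final step described above, in which ICM updates an initial per-pixel classification using an MRF prior, can be sketched in a much-simplified form. The Python example below is an assumption-laden illustration rather than the authors' algorithm: it handles a single date only, replaces the ecological-compatibility model with a plain Potts smoothness penalty over the four nearest neighbours, and takes hypothetical class probabilities in place of real SVM outputs. The names `icm`, `probs` and `beta` are invented for this sketch.

```python
import math


def icm(probs, beta=1.0, max_iter=10):
    """Iterated conditional modes on a 2-D grid with a Potts prior.

    probs[r][c] is a list of per-class probabilities for pixel (r, c),
    standing in for the output of an initial spectral classifier.
    beta weights the penalty for disagreeing with a 4-neighbour.
    """
    rows, cols = len(probs), len(probs[0])
    n_classes = len(probs[0][0])
    # Initialise every pixel with its most probable class.
    labels = [[max(range(n_classes), key=lambda k: probs[r][c][k])
               for c in range(cols)] for r in range(rows)]
    for _ in range(max_iter):
        changed = False
        for r in range(rows):
            for c in range(cols):
                neighbours = [labels[rr][cc]
                              for rr, cc in ((r - 1, c), (r + 1, c),
                                             (r, c - 1), (r, c + 1))
                              if 0 <= rr < rows and 0 <= cc < cols]

                def energy(k):
                    # Data term plus count of disagreeing neighbours.
                    unary = -math.log(max(probs[r][c][k], 1e-12))
                    pairwise = beta * sum(1 for n in neighbours if n != k)
                    return unary + pairwise

                best = min(range(n_classes), key=energy)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break  # converged: no label changed in a full sweep
    return labels


# Hypothetical 3x3 scene: confident class 0 everywhere, ambiguous centre pixel.
probs = [[[0.9, 0.1] for _ in range(3)] for _ in range(3)]
probs[1][1] = [0.45, 0.55]
smoothed = icm(probs, beta=1.0)
```

With `beta=0.0` the prior has no effect and each pixel keeps its most probable class, which corresponds to the per-pixel baseline. A spatial-temporal variant in the spirit of the abstract would also include the same pixel's label on the other date among the neighbours.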

19.
Over the past 20 years self-report measures of healthcare students’ information and communication technology skills have been developed with limited validation. Furthermore, measures of student experience of e-learning emerged but were not repeatedly used with diverse populations. A psychometric approach with five phases was used to develop and test a new self-report measure of skills and experience with information and communication technology and attitudes to computers in education. Phase 1: Literature review and identification of key items. Phase 2: Development and refinement of items with expert panel (n = 16) and students (n = 3) to establish face and content validity. Phase 3: Pilot testing of draft instrument with graduate pre-registration nursing students (n = 60) to assess administration procedures and acceptability of the instrument. Phase 4: Test–retest with further sample of graduate pre-registration nursing students (n = 70) tested stability and internal consistency. Phase 5: Main study with pre-registration nursing students (n = 458), further testing of internal consistency. The instrument proved to have moderate test–retest stability and the sub-scales had acceptable internal consistency. When used with a larger, more diverse population the psychometric properties were more variable. Further work is needed to refine the instrument with specific reference to possible cultural and linguistic response patterns and technological advances.

20.
Previous research has identified the importance of social connectedness in facilitating a number of positive outcomes; however, investigation of connectedness in online contexts is relatively novel. This research aimed to investigate for the first time social connectedness derived from the use of Facebook. Study 1 investigated whether offline social connectedness and Facebook connectedness were separate constructs. Participants were Facebook users (N = 344) who completed measures of offline social connectedness and Facebook social connectedness. Factor analysis (Maximum Likelihood analysis with Oblimin rotation) revealed Facebook connectedness to be distinct from offline social connectedness. Study 2 examined the relationship between Facebook social connectedness and anxiety, depression, and subjective well-being in a second sample of Facebook users (N = 274) in a cross-sectional design. Results suggest that Facebook use may provide the opportunity to develop and maintain social connectedness in the online environment, and that Facebook connectedness is associated with lower depression and anxiety and greater satisfaction with life. Limitations and future directions are considered. It is concluded that Facebook may act as a separate social medium in which to develop and maintain relationships, providing an alternative social outlet associated with a range of positive psychological outcomes.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号