function : extract.noun()
1. Take the data as input.
2. Extract the abstracts.
3. Extract the nouns from each abstract.
4. Return the keywords as a list of nouns.
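The steps above could be sketched in Python roughly as follows. The language choice is an assumption, since the source mixes R-style declarations with a pandas-style `pd.read_csv()` call. `extract_noun` is a hypothetical counterpart of `extract.noun()`. The source does not name a part-of-speech tagger, so the tagger is passed in as a parameter, and `toy_tagger` with its four-word lexicon is only a stand-in that lets the sketch run without external packages. Tags starting with `NN` are assumed to mark nouns (Penn Treebank convention).

```python
from typing import Callable, Iterable, List, Mapping, Tuple

# A tagger maps raw text to (token, part-of-speech tag) pairs.
Tagger = Callable[[str], List[Tuple[str, str]]]


def toy_tagger(text: str) -> List[Tuple[str, str]]:
    """Stand-in tagger (hypothetical): marks words from a tiny lexicon as NN."""
    noun_lexicon = {"network", "model", "system", "algorithm"}
    tokens = [t.strip(".,;:()").lower() for t in text.split()]
    return [(t, "NN" if t in noun_lexicon else "X") for t in tokens if t]


def extract_noun(
    records: Iterable[Mapping[str, object]],
    tagger: Tagger,
    abstract_field: str = "AB",
) -> List[List[str]]:
    """Return one list of noun keywords per record.

    Step 1: take the records as input.
    Step 2: pull the abstract out of each record (the AB column in the sample).
    Step 3: keep only tokens whose tag starts with "NN" (assumed noun tags).
    Step 4: return the noun lists.
    """
    noun_lists = []
    for record in records:
        abstract = str(record.get(abstract_field) or "")
        nouns = [token for token, tag in tagger(abstract) if tag.startswith("NN")]
        noun_lists.append(nouns)
    return noun_lists
```

In practice the `tagger` argument would wrap whichever real tagger the project settles on; only its return shape matters to `extract_noun`.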
function2 : edge.maker()
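1. Take the data as input.
2. Take the list of noun keywords for each abstract and generate all pairs of two keywords (a Combination operation).
3. Stack the Combination results vertically into 'Source' and 'Target' columns.
4. Aggregate identical Source-Target pairs and count them to compute Weight.
5. Return the result.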
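A minimal Python sketch of these five steps, using only the standard library. `edge_maker` is a hypothetical counterpart of `edge.maker()`. Two choices here are assumptions rather than requirements from the text: each keyword pair is counted at most once per abstract, and keywords are sorted within a pair so that (a, b) and (b, a) collapse into one undirected edge.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Sequence, Union


def edge_maker(noun_lists: Sequence[Sequence[str]]) -> List[Dict[str, Union[str, int]]]:
    """Build a weighted, undirected edge list from per-abstract noun lists."""
    pair_counts: Counter = Counter()
    for nouns in noun_lists:
        # Deduplicate within one abstract and sort so each pair has one canonical order.
        unique_nouns = sorted(set(nouns))
        # Combination operation: every unordered pair of keywords in this abstract.
        pair_counts.update(combinations(unique_nouns, 2))

    # Weight = number of abstracts in which the pair co-occurs, heaviest edges first.
    return [
        {"Source": source, "Target": target, "Type": "Undirected", "Weight": weight}
        for (source, target), weight in pair_counts.most_common()
    ]
```

An abstract with k distinct keywords contributes k(k-1)/2 pairs, so run time is dominated by the abstracts with the longest keyword lists.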
Alternatively:
Create a Document-Term Matrix -> create a Term-Term Matrix -> create the edge list
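The matrix route could look like the following Python sketch, written with plain lists so it has no dependencies. The function names are hypothetical, and the Document-Term Matrix is assumed to be binary (a term is either present in an abstract or not), which makes the Term-Term Matrix equal to the product of the transposed DTM with the DTM.

```python
from typing import List, Sequence, Tuple


def build_dtm(noun_lists: Sequence[Sequence[str]]) -> Tuple[List[str], List[List[int]]]:
    """Binary Document-Term Matrix: rows are abstracts, columns are terms."""
    vocab = sorted({noun for nouns in noun_lists for noun in nouns})
    column = {term: j for j, term in enumerate(vocab)}
    dtm = [[0] * len(vocab) for _ in noun_lists]
    for i, nouns in enumerate(noun_lists):
        for noun in set(nouns):
            dtm[i][column[noun]] = 1
    return vocab, dtm


def build_ttm(dtm: Sequence[Sequence[int]]) -> List[List[int]]:
    """Term-Term Matrix: entry [a][b] counts abstracts containing both terms."""
    n_terms = len(dtm[0]) if dtm else 0
    ttm = [[0] * n_terms for _ in range(n_terms)]
    for row in dtm:
        present = [j for j, value in enumerate(row) if value]
        for a in present:
            for b in present:
                ttm[a][b] += 1
    return ttm


def ttm_to_edges(vocab: Sequence[str], ttm: Sequence[Sequence[int]]) -> List[Tuple[str, str, int]]:
    """Read the upper triangle of the TTM as (Source, Target, Weight) rows."""
    edges = [
        (vocab[a], vocab[b], ttm[a][b])
        for a in range(len(vocab))
        for b in range(a + 1, len(vocab))
        if ttm[a][b] > 0
    ]
    # Heaviest edges first; ties keep vocabulary order because sorted() is stable.
    return sorted(edges, key=lambda edge: -edge[2])
```

Only the upper triangle is read because the matrix is symmetric and its diagonal holds each term's document frequency rather than an edge. For a large vocabulary a dense matrix of this kind grows quadratically, so a sparse representation would likely be needed.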
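1. Please write the code so that it is as modular as possible.
2. When running large functions, please write the code so that the start time and end time can be checked.
3. If you use existing packages and functions, please be sure to state the versions used.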
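Guidelines 2 and 3 could be covered with two small helpers, sketched here in Python with the standard library only. `timed` and `package_versions` are hypothetical names, not part of the original specification.

```python
import time
from datetime import datetime
from functools import wraps
from importlib import metadata
from typing import Callable, Dict, Iterable


def timed(func: Callable) -> Callable:
    """Decorator that prints the start time, end time, and elapsed seconds of a call."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        started_at = datetime.now()
        t0 = time.perf_counter()
        print(f"[{func.__name__}] start: {started_at:%Y-%m-%d %H:%M:%S}")
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so the end time is always reported.
            elapsed = time.perf_counter() - t0
            print(f"[{func.__name__}] end:   {datetime.now():%Y-%m-%d %H:%M:%S} ({elapsed:.3f}s)")
    return wrapper


def package_versions(names: Iterable[str]) -> Dict[str, str]:
    """Look up the installed version of each named distribution."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions
```

Decorating the large functions with `@timed` keeps the timing logic out of their bodies, which also helps with guideline 1.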
# Function declaration
extract.noun = function(data){
  # Perform the function's task
}
# Function declaration
edge.maker = function(noun.list){
  # Perform the function's task
}
# Code execution
keyword = read.csv()
head(keyword)
noun.list = extract.noun(keyword)
edgelist = edge.maker(noun.list)
head(edgelist)
head(keyword)
IDX | TI | AB |
---|---|---|
1 | A computational model for the endogenous arousal of thoughts through Z*-numbers | Natural language provides a rich combinatorial mechanism for encoding meanings - a finite set of words can express an unbounded number of thoughts. Framed in 2015 to extend the purpose of Zadeh’s Z-numbers a Z*-number is a perceptual symbol of the meaning of a natural language expression and consequently mentalese or internal speech. This article through decomposition of the Z(x)-macro-parameters into its atomic constituents presents a model for the endogenous arousal of thoughts during empathetic bespoke comprehension of the real-world. Based on Minsky’s Society of Mind the framework is founded on the assimilation of multimodal experiences a sense of ’unified self and its derivatives (choice interest curiosityetc.) objective and subjective components of knowledge commonsense and attention dynamics over a real-world scenario. The model attempts emulation of slow and fast thinking instinctive reactions learning deliberation reflection and self-conscious decisions. The design has been validated against human responses and aims to contribute to the development of autonomous artificial systems for man-machine symbiosis. (C) 2017 Elsevier Inc. All rights reserved. |
2 | Mapping vulnerability of multiple aquifers using multiple models and fuzzy logic to objectively derive model structures | Driven by contamination risks mapping Vulnerability Indices (VI) of multiple aquifers (both unconfined and confined) is investigated by integrating the basic DRASTIC framework with multiple models overarched by Artificial Neural Networks (ANN). The DRASTIC framework is a proactive tool to assess VI values using the data from the hydrosphere lithosphere and anthroposphere. However a research case arises for the application of multiple models on the ground of poor determination coefficients between the VI values and non-point anthropogenic contaminants. The paper formulates SCFL models which are derived from the multiple model philosophy of Supervised Committee (SC) machines and Fuzzy Logic (FL) and hence SCFL as their integration. The Fuzzy Logic based (FL) models include: Sugeno Fuzzy Logic (SFL) Mamdani Fuzzy Logic (MFL) Larsen Fuzzy Logic (LFL) models. The basic DRASTIC framework uses prescribed rating and weighting values based on expert judgment but the four FL-based models (SFL MFL LFL and SCFL) derive their values as per internal strategy within these models. The paper reports that FL and multiple models improve considerably on the correlation between the modeled vulnerability indices and observed nitrate-N values and as such it provides evidence that the SCFL multiple models can be an alternative to the basic framework even for multiple aquifers. The study area with multiple aquifers is in Varzeqan plain East Azerbaijan northwest Iran. (C) 2017 Elsevier B.V. All rights reserved. |
3 | A review of affective computing: From unimodal analysis to multimodal fusion | Affective computing is an emerging interdisciplinary research field bringing together researchers and practitioners from various fields ranging from artificial intelligence natural language processing to cognitive and social sciences. With the proliferation of videos posted online (e.g. on YouTube Facebook Twitter) for product reviews movie reviews political views and more affective computing research has increasingly evolved from conventional unimodal analysis to more complex forms of multimodal analysis. This is the primary motivation behind our first of its kind comprehensive literature review of the diverse field of affective computing. Furthermore existing literature surveys lack a detailed discussion of state of the art in multimodal affect analysis frameworks which this review aims to address. Multimodality is defined by the presence of more than one modality or channel e.g. visual audio text gestures and eye gage. In this paper we focus mainly on the use of audio visual and text information for multimodal affect analysis since around 90% of the relevant literature appears to cover these three modalities. Following an overview of different techniques for unimodal affect analysis we outline existing methods for fusing information from different modalities. As part of this review we carry out an extensive study of different categories of state-of-the-art fusion techniques followed by a critical analysis of potential performance improvements with multimodal analysis compared to unimodal analysis. A comprehensive overview of these two complementary fields aims to form the building blocks for readers to better understand this challenging and exciting research field. (C) 2017 Elsevier B.V. All rights reserved. |
4 | Stock market one-day ahead movement prediction using disparate data sources | There are several commercial financial expert systems that can be used for trading on the stock exchange. However their predictions are somewhat limited since they primarily rely on time-series analysis of the market. With the rise of the Internet new forms of collective intelligence (e.g. Google and Wikipedia) have emerged representing a new generation of “crowd-sourced” knowledge bases. They collate information on publicly traded companies while capturing web traffic statistics that reflect the public’s collective interest. Google and Wikipedia have become important “knowledge bases” for investors. In this research we hypothesize that combining disparate online data sources with traditional time-series and technical indicators for a stock can provide a more effective and intelligent daily trading expert system. Three machine learning models decision trees neural networks and support vector machines serve as the basis for our “inference engine”. To evaluate the performance of our expert system we present a case study based on the AAPL (Apple NASDAQ) stock. Our expert system had an 85% accuracy in predicting the next-day AAPL stock movement which outperforms the reported rates in the literature. Our results suggest that: (a) the knowledge base of financial expert systems can benefit from data captured from nontraditional “experts” like Google and Wikipedia; (b) diversifying the knowledge base by combining data from disparate sources can help improve the performance of financial expert systems; and (c) the use of simple machine learning models for inference and rule generation is appropriate with our rich knowledge database. Finally an intelligent decision making tool is provided to assist investors in making trading decisions on any stock commodity or index. (c) 2017 Elsevier Ltd. All rights reserved. |
5 | A sub-space artificial neural network for mold cooling in injection molding | The applications of artificial intelligence (AI) have considerably expanded over recent years. A new class of industrial systems is beginning to evolve that incorporates using high volume data and advanced analytics to better optimize product quality while reducing energy consumption. Artificial neural networks (ANN) when combined with advanced modeling and control begins to form an AI platform that can be further enhanced for factories of the future. This paper provides a demonstration of such initial work that can be further developed for future systems in a generic way. When considering polymer processing such as plastic injection molding the mold cavity temperature (MCT) profile directly relates to part quality and part reject rates. Therefore it is desirable to optimize the mold cooling process using real time control of MCT as it directly affect part quality. However MCT is affected by a number of interacting nonlinear dynamic parameters that are often neglected due to the challenge of quantifying such parameters. Advanced model based control algorithms are often used for providing improved control of complex systems. However they depend on good model formulations that are analytically insufficient. An online intelligent system identification approach for the mold cooling process is developed and tested. An ANN is designed to adjust online sub-space parameters that govern a mold cooling model. Results demonstrate that this online ANN approach can be used to accurately predict the dynamic behavior of mold cavity surface temperature. This is key to many industrial systems where their states are not directly observable and uncertainties are unknown. The methodology can be readily adapted for different operating conditions as in this case of polymer processing and has good potential for its integration with advanced model based control schemes and cloud computing approaches for the next generation of machines. (C) 2017 Elsevier Ltd. All rights reserved. |
6 | Automatic Density Peaks Clustering Using DNA Genetic Algorithm Optimized Data Field and Gaussian Process | Clustering by fast search and finding of Density Peaks ( called as DPC) introduced by Alex Rodriguez and Alessandro Laio attracted much attention in the field of pattern recognition and artificial intelligence. However DPC still has a lot of defects that are not resolved. Firstly the local density rho(i) of point i is affected by the cutoff distance dc which can influence the clustering result especially for small real-world cases. Secondly the number of clusters is still found intuitively by using the decision diagram to select the cluster centers. In order to overcome these defects this paper proposes an automatic density peaks clustering approach using DNA genetic algorithm optimized data field and Gaussian process (referred to as ADPC-DNAGA). ADPC-DNAGA can extract the optimal value of threshold with the potential entropy of data field and automatically determine the cluster centers by Gaussian method. For any data set to be clustered the threshold can be calculated from the data set objectively rather than the empirical estimation. The proposed clustering algorithm is benchmarked on publicly available synthetic and real-world datasets which are commonly used for testing the performance of clustering algorithms. The clustering results are compared not only with that of DPC but also with that of several well-known clustering algorithms such as Affinity Propagation DBSCAN and Spectral Cluster. The experimental results demonstrate that our proposed clustering algorithm can find the optimal cutoff distance d(c) to automatically identify clusters regardless of their shape and dimension of the embedded space and can often outperform the comparisons. |
head(edgelist)
Source | Target | Type | Weight | Source_Label | Target_Label |
---|---|---|---|---|---|
1350 | 1218 | Undirected | 518 | artificial neural network | artificial intelligence |
7587 | 1218 | Undirected | 305 | expert system | artificial intelligence |
9049 | 1350 | Undirected | 214 | genetic algorithm | artificial neural network |
9049 | 1218 | Undirected | 185 | genetic algorithm | artificial intelligence |
8595 | 1218 | Undirected | 177 | fuzzy | artificial intelligence |
12951 | 1218 | Undirected | 171 | machine learning | artificial intelligence |
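A table with these six columns could be produced from weighted keyword pairs roughly as in the Python sketch below. The function names are hypothetical. Numbering the keywords in alphabetical order is an assumption: it is consistent with the IDs in the sample rows, which increase alphabetically with the label, but the source does not state how IDs were assigned.

```python
import csv
import io
from typing import Dict, List, Sequence, Tuple, Union

COLUMNS = ["Source", "Target", "Type", "Weight", "Source_Label", "Target_Label"]


def to_edge_table(
    weighted_pairs: Sequence[Tuple[str, str, int]],
) -> List[Dict[str, Union[str, int]]]:
    """Turn (source label, target label, weight) tuples into ID-based rows."""
    labels = sorted({label for s, t, _ in weighted_pairs for label in (s, t)})
    # Assumed ID scheme: 1-based position of the label in alphabetical order.
    node_id = {label: i for i, label in enumerate(labels, start=1)}
    ordered = sorted(weighted_pairs, key=lambda pair: -pair[2])  # heaviest first
    return [
        {
            "Source": node_id[s],
            "Target": node_id[t],
            "Type": "Undirected",
            "Weight": w,
            "Source_Label": s,
            "Target_Label": t,
        }
        for s, t, w in ordered
    ]


def to_csv_text(rows: Sequence[Dict[str, Union[str, int]]]) -> str:
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keeping both the numeric IDs and the labels in each row makes the file usable on its own, without a separate node table.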