Attenuator Grammatical Dependency Parsing

PURPOSE

This subsection describes the Grammatical Dependency Parsing process. Grammatical dependency analysis was used to identify the words that are grammatically related to Attenuators. The script was developed in Python; the code chunks below are reproduced as comments, with each line prefixed by #.

############# PREPARATION #########################
#import os
#import numpy as np
#import pandas as pd
#import nltk
#import re
#import string
#import itertools
#import csv
#from pandas import ExcelWriter
#from nltk.tokenize import sent_tokenize
#nltk.download('punkt')  # Punkt sentence tokenizer model used by sent_tokenize

#os.chdir("/Users/lisaherzog/Google Drive/UM/Smart Services/Thesis/Thesis/Code/Feature Set4/Input")

Import Data

#1. Importing the dataset

#from pandas import ExcelFile

#Data = pd.read_excel('6. Attenuator Fragments.xlsx')

#Text =Data['Attenuator.Text'].tolist()

Tokenize Sentences

In the next step, each attenuator fragment is split into sentences, giving one list of sentences per fragment.

#Sent_Text = []

#for i in range(len(Text)):  # all fragments, rather than a hard-coded 623
    #txt = Text[i]
    #tokenized = sent_tokenize(txt)  # list of sentences for fragment i
    #Sent_Text.append(tokenized)

#Sent_Text[0:5]
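
The nested structure produced by this step can be illustrated with a minimal sketch. It assumes a naive regular-expression splitter as a stand-in for NLTK's sent_tokenize (so that it runs without the Punkt model), and the two fragments are invented for illustration; neither is taken from the thesis data.

```python
import re

def split_sentences(text):
    """Naive stand-in for sent_tokenize: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r'(?<=[.!?])\s+', text.strip())
    return [p for p in parts if p]

# Hypothetical fragments (not from the actual dataset)
fragments = [
    "The room was fairly clean. Staff were somewhat slow!",
    "Breakfast was rather bland.",
]

# One list of sentences per fragment, mirroring the shape of Sent_Text
sent_text = [split_sentences(f) for f in fragments]
print(sent_text)
```

Each element of the outer list corresponds to one fragment, and each inner list holds that fragment's sentences. The real tokenizer handles abbreviations and similar edge cases that this stand-in does not.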

Grammatical Parsing

Next, the Stanford Dependency Parser is loaded through NLTK's interface, using the paths to the parser and model jar files.

#from nltk.parse.stanford import StanfordDependencyParser
#path_to_jar = '/Users/lisaherzog/Google Drive/UM/Smart Services/Thesis/Thesis/Stanford Grammatical Parser/stanford-parser-full-2015-04-20/stanford-parser.jar'
#path_to_models_jar = '/Users/lisaherzog/Google Drive/UM/Smart Services/Thesis/Thesis/Stanford Grammatical Parser/stanford-parser-full-2015-04-20/stanford-parser-3.5.2-models.jar'
#dependency_parser = StanfordDependencyParser(path_to_jar=path_to_jar, path_to_models_jar=path_to_models_jar)

Each sentence of the review text was then parsed with the dependency parser, and the resulting dependency triples were collected per fragment.

#Reviews= []
#Length= len(Sent_Text)

#for i in range(0,Length):
    #Extract = Sent_Text[i]
    #Extract_Length=len(Extract)
    #dependencies = []  # one list of triples per sentence of fragment i
    #Reviews.append(dependencies)

    #for j in range(0,Extract_Length):
        #Sent_Extract = str(Extract[j])
        #result = dependency_parser.raw_parse(Sent_Extract)  # iterator over dependency graphs
        #dep = next(result)  # take the first parse
        #resultList = list(dep.triples())  # ((head, tag), relation, (dependent, tag))
        #dependencies.append(resultList)
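
To show how such triples can be used to find the words related to an attenuator, here is a minimal sketch. The triples are hand-written in the shape that NLTK's triples() yields, namely ((head, tag), relation, (dependent, tag)); they are not actual parser output. The helper related_words is hypothetical and is not part of the thesis script.

```python
# Hand-written example triples (illustrative, not real parser output)
triples = [
    (("clean", "JJ"), "nsubj", ("room", "NN")),
    (("room", "NN"), "det", ("The", "DT")),
    (("clean", "JJ"), "cop", ("was", "VBD")),
    (("clean", "JJ"), "advmod", ("fairly", "RB")),
]

def related_words(triples, attenuator):
    """Return (relation, word) pairs for triples in which the attenuator is head or dependent."""
    target = attenuator.lower()
    related = []
    for (head, _head_tag), rel, (dep, _dep_tag) in triples:
        if dep.lower() == target:
            related.append((rel, head))   # attenuator modifies the head
        elif head.lower() == target:
            related.append((rel, dep))    # attenuator governs the dependent
    return related

print(related_words(triples, "fairly"))
```

In this toy sentence, the attenuator "fairly" is linked to "clean" through the advmod relation.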

Finally, the nested tuple structure was simplified by keeping, for each fragment, only the dependency triples of its first sentence.

#Reviews_Simplified = []

#for k in range(len(Reviews)):  # all fragments, rather than a hard-coded 623
    #Extract = Reviews[k][0]  # triples of the first sentence only
    #Reviews_Simplified.append(Extract)
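
The effect of this simplification can be sketched on toy data. The example below assumes the same nesting as Reviews (fragment, then sentence, then triples), with invented triples. It also shows one possible alternative, flattening the triples of all sentences, purely for comparison; the thesis script itself keeps only the first sentence.

```python
from itertools import chain

# Hypothetical parsed fragments: fragment -> sentence -> list of triples
reviews = [
    [
        [(("clean", "JJ"), "advmod", ("fairly", "RB"))],
        [(("slow", "JJ"), "advmod", ("somewhat", "RB"))],
    ],
    [
        [(("bland", "JJ"), "advmod", ("rather", "RB"))],
    ],
    [],  # a fragment for which no sentence was found
]

# First-sentence-only simplification, guarded against empty fragments
first_only = [sents[0] if sents else [] for sents in reviews]

# Alternative (assumption, not the thesis method): merge triples of all sentences
flattened = [list(chain.from_iterable(sents)) for sents in reviews]

print(first_only)
print(flattened)
```

The guard for empty fragments avoids the IndexError that indexing with [0] would raise on a fragment without sentences.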