1. Try out Watson Natural Language Understanding

Using the link below, enter text or a URL and check the Watson Natural Language Understanding results.

(1) Open the Watson Natural Language Understanding Demo and log in.

(2) Enter the text or URL you want to analyze, run the analysis, and review the results. Watson NLU analyzes the text and returns results for Sentiment, Entity, Concept, Keywords, Semantic Roles, and Categories.
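For readers who prefer to see the demo's inputs as data, here is a minimal sketch of the JSON body an analyze request would carry when all six result types are requested. The function name build_analyze_payload and the URL detection rule are illustrative assumptions, not part of the demo; the empty dictionaries simply mean "use the default options for this feature".

```python
import json

# Feature names for the six result types listed above.
ALL_FEATURES = ("sentiment", "entities", "concepts", "keywords",
                "semantic_roles", "categories")


def build_analyze_payload(source, features=ALL_FEATURES):
    """Build a request body for either raw text or a URL (hypothetical helper)."""
    # Simplification: anything that starts with http:// or https:// is sent as a URL.
    key = "url" if source.startswith(("http://", "https://")) else "text"
    return {key: source, "features": {name: {} for name in features}}


payload = build_analyze_payload("IBM Watson makes tweets easier to analyze.")
print(json.dumps(payload, indent=2, sort_keys=True))
```

Passing a shorter tuple, for example features=("sentiment",), requests only that result type.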

2. Create the required IBM Cloud services

1. Create the Watson Natural Language Understanding service

1) IBM Cloud > Catalog > AI > select Natural Language Understanding
2) Enter a service name; Region: US South; Resource group: default; Pricing plan: Lite
3) Click Create
4) Copy the service credentials and keep them for later
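One way to keep the copied credentials out of the notebook itself is to save them as JSON and parse them when needed. The sketch below assumes the credentials contain username and password fields, matching the arguments the notebook passes to the NLU client later; the helper name and the sample values are placeholders, and your credentials may use different field names.

```python
import json


def load_nlu_credentials(raw_json):
    """Parse copied credentials JSON and return (username, password)."""
    creds = json.loads(raw_json)
    # Fail early with a clear message if an expected field is absent.
    missing = [key for key in ("username", "password") if key not in creds]
    if missing:
        raise KeyError("missing credential fields: " + ", ".join(missing))
    return creds["username"], creds["password"]


# Placeholder values only; paste your own credentials instead.
sample = ('{"url": "https://example.invalid/nlu", '
          '"username": "my-username", "password": "my-password"}')
username, password = load_nlu_credentials(sample)
```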

2. Create the Apache Spark service
1) IBM Cloud > Catalog > Web and Application > select Apache Spark
2) Enter a service name; Region: US South; Resource group: default; Pricing plan: Lite
3) Click Create

3. Create the DB2 Warehouse on Cloud service
1) IBM Cloud > Catalog > Databases > select DB2 Warehouse
2) Enter a service name; Region: US South; Resource group: default; Pricing plan: Lite
3) Click Create
4) Click Manage > Open

4. Create the Knowledge Catalog service
1) IBM Cloud > Catalog > AI > select Knowledge Catalog
2) Enter a service name; Region: US South; Resource group: Default; Pricing plan: Lite
3) Click Create

5. Connect Watson Studio and Knowledge Catalog
1) IBM Cloud > Dashboard > click the Watson Studio service you created > Manage > click Get Started
2) On the Watson Studio Get Started screen, click Watson Knowledge Studio > select the Knowledge Catalog service you created

3. Prepare the data

1. Twitter data

2. Upload the data to DB2 Warehouse on Cloud

1) IBM Cloud > Dashboard > click the DB2 Warehouse on Cloud service you created > Manage > click Open
2) Click Load > click browse files > choose the file > click Next
3) Select a schema (e.g. DASH10461) > select New Table > enter a table name under the Create a new Table tab (e.g. TB_TWEETS) > click Create > click Next
4) Change the Separator to | > click Next > click Begin Load
5) Click View Table to check the loaded data
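The load step expects a file whose separator is |, which keeps commas inside tweets from being read as column breaks. The sketch below shows one way to produce such a file, assuming a single text column per row. The function name and sample tweets are made up for illustration. Removing | characters and line breaks from the tweet text is a simplifying choice, not something the loader requires.

```python
import csv
import io


def tweets_to_pipe_delimited(tweets):
    """Return the tweets as pipe-delimited text, one tweet per line."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    for tweet in tweets:
        # Drop the separator character and collapse line breaks and repeated spaces.
        cleaned = " ".join(tweet.replace("|", " ").split())
        writer.writerow([cleaned])
    return out.getvalue()


sample_output = tweets_to_pipe_delimited(
    ["Great day, love it", "Bad\nservice", "a | b"])
```

Write the returned string to a file and select that file in the Load dialog.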


3. Register the data loaded into DB2 as an asset in Knowledge Catalog so that several projects can use it

1) Create a catalog
-. Catalog > select New Catalog > enter a catalog name (e.g. CTLG_TWEETS)
-. Select Cloud Object Storage > click Create
2) Create a connection
-. Add to Catalog > select Connection > select the DB2 Warehouse service you created earlier > click Create
-. Check the DB2 Warehouse connection that was created
3) Create a connected asset
-. Add to Catalog > select Connected Asset > select Select Source
-. Select the DB2 Warehouse connection you created earlier > select the schema (e.g. DASH10461) > select the table (e.g. TB_TWEETS) > click Select
-. Enter an asset name (e.g. AST_TB_TWEETS) > click Add

-. Check the data asset that was created


4. Check sentiment with Watson NLU + Watson Studio

1. Create a notebook

1) Select the project in which to create the notebook (e.g. WatsonStudioWithWatson)
2) Settings tab > Associated services > +Add service > Spark > Existing tab > the Apache Spark service you created earlier
3) Select the service > click Select
4) Click New notebook
5) Enter the notebook name: TwitterAnalysisWithNLU
6) Runtime: select the Apache Spark service you created earlier
7) Language: select Python 2
8) Click Create Notebook

2. Import the data assets to use in the project

1) Select Catalog in the right pane
2) Select the catalog you created: CTLG_TWEETS
3) Select the asset you created: AST_TB_TWEETS
4) Click Add to Project
5) Confirm the linked catalog and asset, then click Add
6) Check the registered data assets in the left pane

3. Use Watson NLU to check sentiment, then upload the results to DB2

1) Open the notebook you created (e.g. TwitterAnalysisWithNLU)

Cell 1 - Select the asset in the right pane (e.g. AST_TB_TWEETS) > select Insert Pandas DataFrame
Cell 2 - Import the modules needed to call NLU
import json
from watson_developer_cloud import NaturalLanguageUnderstandingV1
from watson_developer_cloud.natural_language_understanding_v1 import (
    Features, EntitiesOptions, KeywordsOptions, SentimentOptions, EmotionOptions)
Cell 3 - Paste the NLU service credentials
nlu = NaturalLanguageUnderstandingV1(
    version='2018-03-16',
    username='YOUR_NLU_USERNAME',
    password='YOUR_NLU_PASSWORD')
Cell 4 - Extract a single tweet and view the NLU result
# Take the first tweet; this assumes the tweet text is the first column of data_df_1
text = data_df_1.iloc[0, 0]
features = Features(entities=EntitiesOptions(), keywords=KeywordsOptions(), emotion=EmotionOptions(), sentiment=SentimentOptions())

response = nlu.analyze(text=text, features=features, language='en')

nlu_dump = json.dumps(response, indent=2)
print nlu_dump

Cell 5 - Read every tweet stored in the database and extract the Sentiment Type and Sentiment Score
overallSentimentScore = []
overallSentimentType = []

# Iterate over the tweet text; this assumes it is the first column of data_df_1
for text in data_df_1.iloc[:, 0]:
    enriched_json = nlu.analyze(text=text, features=features, language='en')

    if 'sentiment' in enriched_json:
        # Fall back to '0' when the score or label is missing
        if 'score' in enriched_json['sentiment']['document']:
            overallSentimentScore.append(enriched_json['sentiment']['document']['score'])
        else:
            overallSentimentScore.append('0')

        if 'label' in enriched_json['sentiment']['document']:
            overallSentimentType.append(enriched_json['sentiment']['document']['label'])
        else:
            overallSentimentType.append('0')
    else:
        overallSentimentScore.append('0')
        overallSentimentType.append('0')

print overallSentimentType,overallSentimentScore
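The nested checks in Cell 5 can also be written as one small helper. This is a sketch, not the notebook's own code: the name extract_sentiment, the numeric default score, and the "unknown" default label are assumptions, and the sample response is invented to show the shape that Cell 5 reads.

```python
def extract_sentiment(response, default_score=0.0, default_label="unknown"):
    """Return (score, label) from an NLU response dict, with fallbacks."""
    # .get() with an empty dict avoids a KeyError when a level is missing.
    document = response.get("sentiment", {}).get("document", {})
    return (document.get("score", default_score),
            document.get("label", default_label))


# Invented sample that mimics the fields Cell 5 reads.
sample_response = {"sentiment": {"document": {"score": -0.62, "label": "negative"}}}
score, label = extract_sentiment(sample_response)
```

A numeric default keeps the score list a single type, which is convenient if the scores are averaged or charted later.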

Cell 6 - Convert the results to a DataFrame so they can be charted
import pandas as pd
import numpy as np

data=np.column_stack((overallSentimentScore,overallSentimentType))
df = pd.DataFrame(data, columns=["SentimentScore", "SentimentType"])
print df

Cell 7 - Display the results
import pixiedust
display(df)
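For a quick check alongside the chart, the two result lists can be summarized in plain Python. This is a sketch using a hypothetical helper name and invented sample values. It converts each score with float() because the lists built earlier may contain the string '0' as a fallback.

```python
from collections import Counter


def summarize_sentiment(labels, scores):
    """Return (count per label, mean score) for two parallel lists."""
    counts = Counter(labels)
    numeric = [float(score) for score in scores]  # tolerates '0' placeholders
    mean = sum(numeric) / len(numeric) if numeric else 0.0
    return counts, mean


# Invented sample values for illustration.
label_counts, mean_score = summarize_sentiment(
    ["positive", "negative", "positive"], [0.8, -0.4, "0"])
```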
Cell 8 - Convert the data type so it can be loaded into DB2
from pyspark.sql import SQLContext
sqlCtx = SQLContext(sc)
sdf = sqlCtx.createDataFrame(df)
Cell 9 - Load into DB2
conn_properties = {
    'user': 'YOUR_DB2_USERNAME',
    'password': 'YOUR_DB2_PASSWORD',
    'driver': 'com.ibm.db2.jcc.DB2Driver'
}

db2_jdbc_url = 'YOUR_DB2_JDBC_URL'

# Save Spark DataFrame to Db2 Warehouse
sdf.write.jdbc(db2_jdbc_url, 'TB_TWEETS_SENTIMENT', properties=conn_properties, mode="overwrite")
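If it is unclear what the JDBC URL should look like, the sketch below assembles one in the usual Db2 form jdbc:db2://host:port/database. The host is a placeholder. The default ports (50000, or 50001 with SSL) and the database name BLUDB are assumptions about a typical Db2 Warehouse on Cloud instance. The reliable source is the JDBC URL listed in your own service credentials.

```python
def build_db2_jdbc_url(host, port=None, database="BLUDB", ssl=False):
    """Assemble a Db2 JDBC URL (hypothetical helper; defaults are assumptions)."""
    if port is None:
        port = 50001 if ssl else 50000
    url = "jdbc:db2://{0}:{1}/{2}".format(host, port, database)
    if ssl:
        # The Db2 JDBC driver reads connection properties after a colon.
        url += ":sslConnection=true;"
    return url


# Placeholder host; replace it with the host from your credentials.
example_url = build_db2_jdbc_url("db2.example.invalid")
```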

4. Check the table created in DB2

1) IBM Cloud > Dashboard > select the Db2 Warehouse on Cloud service you created > Manage > click Open
2) Explorer > select the schema > select the table that was created (e.g. TB_TWEETS_SENTIMENT) > click View Data
