To understand the scope of a patent, a reader must understand its claims, because the claims define the patent's legal scope. However, most researchers have instead focused on combining standard information retrieval (IR) methods, applied to all or part of the patent, with IR methods based on patent citations.
Yet the features used in standard IR methods do not perform as well as richer features that preserve the semantic structure of the document. One such feature is the dependency triple, consisting of two words and the grammatical relation between them. Learning these structures gives a system deeper insight into the document and enables better analysis of it. Using these structures for patent text classification has been shown to outperform simple n-grams.
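For illustration, a dependency triple pairs a head word and a dependent word with the relation linking them. The following minimal sketch extracts such triples using spaCy as a stand-in dependency parser (any dependency parser would serve); the sentence and model name are illustrative assumptions, not part of our system.

```python
# A minimal sketch of extracting dependency triples (head, relation,
# dependent), using spaCy as a stand-in parser for illustration.
# Requires the small English model: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("A head is attached to said shaft.")
for token in doc:
    if token.dep_ != "ROOT":
        # e.g. ('attached', 'nsubjpass', 'head')
        print((token.head.text, token.dep_, token.text))
```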
Although patent claims are written in English, they do not read like standard English. The entire scope of the invention must be described in a single sentence: patent agents and attorneys must cover every possible form of the invention in one long, run-on sentence written in very exacting language.
These constraints, along with other drafting rules, make patent claims difficult to parse. Off-the-shelf NLP parsers such as the Stanford Parser do not produce correct parse trees for most patent claims. As seen in the figure, the Stanford Parser never correctly labels ``said'' as an adjective, and the parsing degrades as a result. This stems from the data used to train the parser's model of English: the Wall Street Journal corpus.
Researchers have proposed ways to avoid this problem. One common method adopted by the patent parsing community is to chunk long patent claims into smaller segments. Chunking also avoids the time and memory costs of parsing sentences as long as most patent claims. Even with smaller segments, however, the Stanford Parser does not perform well.
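As one illustration, a minimal chunking heuristic might split a claim at its ``comprising:'' transition and at the semicolons that separate claim elements. The delimiters below are an assumption for illustration; the chunking rules used in the literature vary.

```python
import re

def chunk_claim(claim_text):
    """Split a long patent claim into smaller segments for parsing.

    A minimal heuristic: claims typically introduce elements with a
    'comprising:' transition and separate them with semicolons, so
    splitting on those delimiters yields shorter, more parseable text.
    """
    segments = re.split(r";\s*(?:and\s+)?|:\s*", claim_text)
    return [s.strip() for s in segments if s.strip()]

claim = ("A fastener comprising: a threaded shaft; "
         "a head attached to said shaft; and "
         "a washer surrounding said shaft.")
print(chunk_claim(claim))
```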
The only approach guaranteed to yield better parses is to train a grammar model on patents and use that model in the parser. Training such a model, however, requires a hand-annotated corpus of patent claims, and creating that corpus demands development time and resources that make it infeasible.
The Stanford Parser does, however, provide a mechanism to force certain words to receive particular part-of-speech (POS) tags. As seen in the figure, simply correcting the incorrect verb tags (sometimes over multiple iterations) and rerunning the Stanford Parser yields a correct parse.
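Concretely, the parser accepts pre-tagged input in which each token is written as word/TAG. The sketch below only prepares such input from a first-pass tagging plus a set of corrections; the exact parser flags for reading pre-tagged text (e.g., -tokenized and -tagSeparator) vary by version, so the comment is indicative rather than definitive.

```python
# A minimal sketch, assuming the Stanford Parser's pre-tagged input
# mode: tokens written as word/TAG, read with whitespace tokenization
# via flags along the lines of
#   -tokenized -tagSeparator / \
#   -tokenizerFactory edu.stanford.nlp.process.WhitespaceTokenizer
# (check the parser documentation for the exact options).

def force_tags(tokens, corrections):
    """Emit word/TAG tokens, overriding tags listed in `corrections`.

    tokens:       list of (word, tag) pairs from the parser's first pass
    corrections:  dict mapping token index -> corrected tag
    """
    return " ".join(
        f"{word}/{corrections.get(i, tag)}"
        for i, (word, tag) in enumerate(tokens)
    )

tokens = [("said", "VBD"), ("fastener", "NN")]
print(force_tags(tokens, {0: "JJ"}))  # -> said/JJ fastener/NN
```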
We developed a system that automatically corrects these incorrect POS tags. Using a simple SVM-based classifier, it learns the properties of incorrectly tagged words and the tags they should instead receive, and then applies these corrections to POS tags in other patent claims.
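A minimal sketch of such a corrector follows, assuming simple lexical and context features; the feature template and training examples here are illustrative assumptions, not the feature set our system actually uses.

```python
# A minimal sketch of an SVM-based tag corrector over hypothetical
# lexical/context features for tokens the parser tagged as verbs.
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def features(word, prev_tag, next_tag):
    # Hypothetical feature template for a token originally tagged VB*.
    return {
        "word": word.lower(),
        "suffix3": word[-3:].lower(),
        "prev_tag": prev_tag,
        "next_tag": next_tag,
        "capitalized": word[0].isupper(),
    }

# Training data: (features, correct tag) pairs, e.g. gathered via AMT.
X = [features("said", "IN", "NN"), features("comprising", "NN", "DT")]
y = ["JJ", "VBG"]  # the tags the words should have received

model = make_pipeline(DictVectorizer(), LinearSVC())
model.fit(X, y)
print(model.predict([features("said", "DT", "NN")]))  # -> ['JJ']
```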
For the system to learn, we gathered a corpus of words incorrectly labeled as verbs, together with their correct tags. Reducing the annotation task from obtaining POS tags for every word in a claim to verifying only the words originally tagged as verbs made assembling this corpus a problem that Amazon Mechanical Turk (AMT) could feasibly solve.
To show that this technique has merit, we devised a simple patent subject classification task. The mature field of patent subject classification *cite several papers here* is well suited to demonstrating that this simple fix to patent claim parsing is enough to improve performance. The task involves classifying new patents into one of several categories or subcategories.
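As a rough sketch of the evaluation pipeline, once dependency triples have been extracted from corrected parses, each patent can be represented as a bag of triples and fed to a standard classifier. The triples, category labels, and classifier choice below are illustrative assumptions.

```python
# A minimal sketch of subject classification over dependency-triple
# features; each claim is represented as a "document" of triple tokens.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

claims = [
    "nsubj(comprising,fastener) dobj(comprising,shaft)",
    "nsubj(comprising,circuit) dobj(comprising,transistor)",
]
labels = ["mechanical", "electrical"]  # hypothetical subject categories

clf = make_pipeline(CountVectorizer(token_pattern=r"\S+"), LinearSVC())
clf.fit(claims, labels)
print(clf.predict(["dobj(comprising,transistor)"]))  # -> ['electrical']
```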
In this paper, we provide an overview of our AMT campaign and its evolution over time, along with statistics from each stage of the campaign. We then describe the automatic POS tag corrector and its performance on data from our AMT campaign. Finally, we present our patent subject classification system.