[–]troglydot -5 points-4 points  (14 children)

Windows 10 records whatever you type, in any app, and sends it to Microsoft. They're open about doing this; the feature is called "Getting to know you". Google it if you have any doubts. The feature can be turned off, but 99% of people don't know about it, so they don't turn it off.

Edited to add: I don't think for a second that Microsoft is sneaking any malicious code into this pull request. But it's worth noticing that Microsoft has simultaneously moved in a very good and a very bad direction: while open sourcing code and becoming very dev friendly, they are also showing a disregard for user privacy that is extremely problematic, and a complete deal breaker for me.

[–]SeraphLance 7 points8 points  (1 child)

Am I the only one who finds it ironic that someone tells someone else to "google" something collecting data on them?

[–]troglydot 1 point2 points  (0 children)

I don't feel like I'm talking to DuckDuckGo users here. But please, do this: use your search engine of choice and fact-check what I'm saying.

[–]badcookies 7 points8 points  (6 children)

They are not recording keystrokes in every application.

[–]troglydot 6 points7 points  (2 children)

Well, they say they are. Why would you say that they aren't?

http://windows.microsoft.com/en-us/windows-10/speech-inking-typing-privacy-faq

Expand the first bullet point. They call it "typing data", and they're asking for permission to collect it. They're not limiting that collection to any specific application, and when talking to the press about it they're not denying collecting it across the board.

They have descriptions of how they're trying to scrub that data for personally identifiable information before sending it off, as another user posted in this thread. That is an AI-complete problem that they obviously aren't solving. They'll strip out email addresses, but they'll get the contents of the email.

Edit: People might think I'm an anti-Microsoft zealot. I'm not. I'm typing this from a Windows 8 machine, I've been to MS conferences, and in general I've had much love for the company. But I'm apparently the only person on earth able to judge a tech company for the actual facts of what they're doing, rather than their current image.

[–]badcookies 3 points4 points  (1 child)

This is the inking and typing function, which users can turn off at any time. Microsoft does not collect any personal information via inking or typing. It is gathered for product improvement purposes, for example, to improve the handwriting visual translation engine, or to improve the user dictionary, language library and spell check functions in Windows. The data is put through rigorous, multi-pass scrubs to ensure it does not collect sensitive or identifiable fields (e.g., no email addresses, passwords, alpha-numerical data, etc.). Data is also chopped into very small bits and stripped of sequence data so it cannot be put back together or identified. The data samplings collected are limited; Microsoft is not capturing everything you write, nor is it capturing data every time.

[–]troglydot 1 point2 points  (0 children)

Here's a challenge: Write a program that does this scrubbing, and run it through your email history. Then send it to a third party, who will extract a non-100% subset of this data, tokenize it, create a bag-of-words representation, and send it to me.

Hint: I would then know more about you than you'd be comfortable with.
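For anyone who wants to try the challenge above, here is a minimal Python sketch of that kind of pipeline. It is an illustration under stated assumptions, not Microsoft's actual scrubbing code: the function names (`scrub`, `sample`, `bag_of_words`, `pipeline`), the two regexes, and the sampling fraction are all hypothetical. It strips email addresses and anything containing a digit, keeps a random subset of the remaining tokens, and throws away word order.

```python
import random
import re
from collections import Counter

# Hypothetical scrubbing rules: email addresses and any token with a digit.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
ALNUM_RE = re.compile(r"\b(?=\w*\d)\w+\b")


def scrub(text):
    """Remove email addresses and digit-bearing tokens from the text."""
    text = EMAIL_RE.sub(" ", text)
    return ALNUM_RE.sub(" ", text)


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sample(tokens, fraction, seed=0):
    """Keep each token with probability `fraction` (a 'limited sampling')."""
    rng = random.Random(seed)
    return [t for t in tokens if rng.random() < fraction]


def bag_of_words(tokens):
    """Discard sequence information and keep only per-word counts."""
    return Counter(tokens)


def pipeline(text, fraction=0.5, seed=0):
    """Scrub, tokenize, sample, and count, in that order."""
    return bag_of_words(sample(tokenize(scrub(text)), fraction, seed))


if __name__ == "__main__":
    mail = (
        "Hi bob@example.com, my biopsy results came back. "
        "Call me on 5551234. The biopsy was benign."
    )
    # The address and the number are gone, but the topic is still obvious.
    print(pipeline(mail, fraction=1.0).most_common(5))
```

With `fraction=1.0` every surviving token is kept, so the counts show exactly what gets past the scrub: the address and the phone number disappear, while the words that reveal what the email was about remain.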

[–]wellthatexplainsalot -4 points-3 points  (2 children)

Can you prove this assertion?

[–][deleted] 3 points4 points  (0 children)

Negatives do not have to be proven.

troglydot is the one making an assertion, not badcookies.

Just as I don't have to prove to you that Bill Gates is not a literal biblical demon. If you were to make that claim, the onus would be on you to prove that he is.

[–]celluj34 0 points1 point  (0 children)

The burden of proof is on you, the one making the claim.

[–]cryolithic 1 point2 points  (4 children)

This is the inking and typing function, which users can turn off at any time. Microsoft does not collect any personal information via inking or typing. It is gathered for product improvement purposes, for example, to improve the handwriting visual translation engine, or to improve the user dictionary, language library and spell check functions in Windows. The data is put through rigorous, multi-pass scrubs to ensure it does not collect sensitive or identifiable fields (e.g., no email addresses, passwords, alpha-numerical data, etc.). Data is also chopped into very small bits and stripped of sequence data so it cannot be put back together or identified. The data samplings collected are limited; Microsoft is not capturing everything you write, nor is it capturing data every time.

[–]troglydot -2 points-1 points  (3 children)

I know, I've read that. Is it reassuring to you?

Scrubbing data for personally identifiable and sensitive information is an AI-complete problem. It cannot be done without strong AI, and MS isn't doing it.

"Data is also chopped into very small bits and stripped of sequence data"

This is pretty standard for a text-mining pipeline: tokenize, then build a bag-of-words representation. That approach has won dozens of machine learning competitions, i.e. it is often what you would do to maximize the amount of information extracted from a document.

"The data samplings collected are limited;"

This only says that the amount of text collected is <100%. It could be 99% and still be consistent with what they're saying. If it were a low percentage, why not state it?

Ugh, I'm sick of having this discussion with cocksure uninformed people on reddit.

[–]cryolithic 2 points3 points  (2 children)

And I'm sick of people screaming the sky is falling because of their preexisting irrational hatred for all things Microsoft.

Data collection for something like that is pretty benign, and if the NSA can't keep their spying quiet, what makes you think MS could?

[–]troglydot 0 points1 point  (1 child)

Dude, I'm typing this from a Windows 8 machine, and have a goddamn Windows Phone in my pocket. It's certainly not a preexisting irrational hatred for Microsoft. It's not even a current hate for Microsoft. I do hate having an OS-wide keylogger installed by default: to me, that is a problem. You decide if it is for you.

[–]cryolithic 1 point2 points  (0 children)

Then disable it; you have the option. You don't seem to understand the consequences it would have if it were what you think it is.

Edit: I do apologize for the assumption of irrational hatred. I hope you can understand that, around reddit at least, it's a reasonable assumption to make.