The Joy of Stats (BBC FOUR)

  • Bryn
    Banned
    • Mar 2007
    • 24688

    Do catch this on the iPlayer. Apart from the general thrust, some of the particulars are truly fascinating.
  • ferneyhoughgeliebte
    Gone fishin'
    • Sep 2011
    • 30163

    #2
    Originally posted by Bryn
    Do catch this on the iPlayer. Apart from the general thrust, some of the particulars are truly fascinating.
    Numquam Satis!

    • Serial_Apologist
      Full Member
      • Dec 2010
      • 37814

      #3
      I have to take one of those every day. There's no joy in them...

      • Bryn
        Banned
        • Mar 2007
        • 24688

        #4
        Originally posted by Serial_Apologist
        I have to take one of those every day. There's no joy in them...
        Thanks for the reminder. Nearly forgot to take mine, along with the evening Acrete D3.

        • Richard Tarleton

          #5
          Fascinating, thanks, caught most of it, must watch again. They said that Google translate is based on statistics, but didn't explain how this works - did they mean frequency with which words occur, sounds, structures - what, exactly? Did you grasp it Bryn?

          • Bryn
            Banned
            • Mar 2007
            • 24688

            #6
            Originally posted by Richard Tarleton
            Fascinating, thanks, caught most of it, must watch again. They said that Google translate is based on statistics, but didn't explain how this works - did they mean frequency with which words occur, sounds, structures - what, exactly? Did you grasp it Bryn?
            No. I too must watch again.

            • jean
              Late member
              • Nov 2010
              • 7100

              #7
              Originally posted by Bryn
              Your link was all in Welsh!

              This one works better for me.

              • french frank
                Administrator/Moderator
                • Feb 2007
                • 30456

                #8
                Originally posted by Richard Tarleton
                Fascinating, thanks, caught most of it, must watch again. They said that Google translate is based on statistics, but didn't explain how this works - did they mean frequency with which words occur, sounds, structures - what, exactly? Did you grasp it Bryn?
                A 4-minute OU video. Does it say anything more (it's the closest I'll get to seeing the BBC Four prog)? There may be other stuff 'out there'.

                For more like this subscribe to the Open University channel https://www.youtube.com/channel/UCXsH4hSV_kEdAOsupMMm4Qw
                Free learning from The Open University - ...
                It isn't given us to know those rare moments when people are wide open and the lightest touch can wither or heal. A moment too late and we can never reach them any more in this world.

                • Richard Tarleton

                  #9
                  Originally posted by french frank
                  A 4-minute OU video. Does it say anything more (it's the closest I'll get to seeing the BBC Four prog)? There may be other stuff 'out there'.

                  https://www.youtube.com/watch?v=AEac-jP5Eho
                  No, that was it for that segment. I didn't understand what Mr Ochs meant by "statistical correlations" - unless you feed the same passage in in different languages - but then there wouldn't be any need for translations, would there? The presenter clearly understood, and didn't realise that we might not.

                  • Richard Tarleton

                    #11
                    Thanks! I shall read this carefully later.

                    • Serial_Apologist
                      Full Member
                      • Dec 2010
                      • 37814

                      #12
                      A programme intended for the more statisticated among us, I think.

                      • ferneyhoughgeliebte
                        Gone fishin'
                        • Sep 2011
                        • 30163

                        #13
                        Originally posted by Serial_Apologist
                        A programme intended for the more statisticated among us, I think.
                        - it may make many "converts"; perhaps particularly from those who discover that they have more than the average number of legs?
                        Numquam Satis!

                        • Pianorak
                          Full Member
                          • Nov 2010
                          • 3128

                          #14
                          Originally posted by Bryn
                          Do catch this on the iPlayer. Apart from the general thrust, some of the particulars are truly fascinating.
                          If you like that then there is probably a statistical likelihood you might also be interested in Tim Harford's “More or Less” programme on R4.
                          My life, each morning when I dress, is four and twenty hours less. (J Richardson)

                          • Richard Tarleton

                            #15
                            I think I understand - it looks for statistical correlations/similarities between documents already translated by humans....

                            Google Translate doesn't know about language at all. This is, roughly speaking, how Google Translate works: Google has gathered a huge database of human-revised translations of millions of documents. They include records kept since 1957 by the EU in two dozen languages, as well as official communications made by the UN and its agencies in its six official languages. These texts were the basis of the first Google Translate algorithms, but more recent versions have also incorporated records of international tribunals and company reports, besides all the articles and books in bilingual form that have been put up on the web by individuals, libraries, booksellers, authors and academic departments.

                            The diversity of input content is what allowed Google to include more than 90 languages in its catalog, including Catalan, Hebrew, Mongolian, Urdu and Zulu among many others [6]. The algorithm is designed to look for patterns among its huge collection of translated texts, in order to find the translation most likely to be associated with the text you entered [7] [8]. The idea is that the text you wish to translate has probably already been written and translated before, somewhere among the huge data set of translated documents that Google has collected. It may even appear many times. By detecting patterns in documents that have already been translated by human translators, the algorithm calculates probabilities as to what an appropriate translation should be. The algorithm then provides the most likely translation....
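
A toy sketch of the "most likely translation" idea described above, in Python. Everything here is an assumption made for illustration: the tiny `human_translations` list, the function names, and the simple counting rule are invented, and the real Google Translate algorithm is not public. The sketch just counts how often human translators chose each target phrase for a source phrase and returns the most frequent one with its estimated probability.

```python
from collections import Counter, defaultdict

# Hypothetical miniature corpus of (source phrase, human translation) pairs.
# A real system would draw on millions of aligned documents.
human_translations = [
    ("bonjour", "hello"),
    ("bonjour", "good morning"),
    ("bonjour", "hello"),
    ("merci", "thank you"),
    ("merci", "thanks"),
    ("merci", "thank you"),
]


def build_phrase_table(pairs):
    """Count how often each target phrase was used for each source phrase."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def most_likely_translation(table, source):
    """Return (translation, estimated probability), or None if never seen."""
    counts = table.get(source)
    if not counts:
        return None
    target, n = counts.most_common(1)[0]
    return target, n / sum(counts.values())


table = build_phrase_table(human_translations)
print(most_likely_translation(table, "bonjour"))
print(most_likely_translation(table, "hola"))  # unseen phrase, so None
```

With this made-up data, "hello" wins for "bonjour" because two of the three recorded human translations used it. A phrase that never appears in the corpus gets no translation at all, which is the weakness the next paragraph goes on to describe.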

                            [the exact algorithm is a secret but...] Google Translate has some great advantages that no other translators have. First of all, any user around the world can correct Google's translation, thus adding additional information to the corpus that feeds the translation model. Remember that the model's inputs are collections of texts already translated by humans, so every single correction counts as a new piece of data. Another unique tool that Google can use to improve its translation is its web search.......

                            The fact that Google Translate relies mostly on previously translated texts is also its weak point. Google does not directly translate from Language 1 to Language 2, at least not most of the time. Consider that many of the translation pairs offered, Korean to Urdu for instance, have no history of translation between them, and therefore no paired texts, either on the web or anywhere else. As a result, the algorithm usually translates the entered text to English, the language with the highest number of translation relationships. The text is then translated from English to any language you want [8]. In some cases, there's a fourth language involved. For instance, if you wish to translate a phrase in Catalan, the text would be first translated to Spanish, then to English and then to any language you may want. This intricate process is very prone to error, which explains why Google Translate is good with some languages, yet barely capable with others. The algorithms rely on the amount and quality of the translated documents that work as input for the process, the same way that the quality of raw data becomes key to the success of any model......
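
The pivoting through English described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `direct_tables` dictionaries, the language codes, the word entries and both function names are made up for the example, and the error estimate assumes each hop fails independently, which the quoted text does not claim.

```python
# Hypothetical word tables, one per language pair that has paired texts.
# The entries are illustrative only.
direct_tables = {
    ("ca", "es"): {"gat": "gato"},
    ("es", "en"): {"gato": "cat"},
    ("en", "ur"): {"cat": "billi"},
}


def translate_via(route, word):
    """Translate a word hop by hop along a route of language codes."""
    for src, dst in zip(route, route[1:]):
        table = direct_tables.get((src, dst))
        if table is None or word not in table:
            return None  # one missing hop breaks the whole chain
        word = table[word]
    return word


def chain_accuracy(per_hop_accuracy, hops):
    """Rough accuracy of a chain, assuming hops fail independently."""
    return per_hop_accuracy ** hops


# Catalan -> Spanish -> English -> Urdu, as in the example in the text.
print(translate_via(["ca", "es", "en", "ur"], "gat"))
# No direct Catalan -> Urdu table exists in this toy data.
print(translate_via(["ca", "ur"], "gat"))
# Three hops at an assumed 90% accuracy each.
print(round(chain_accuracy(0.9, 3), 3))
```

Under that independence assumption, every extra hop multiplies in another chance of error, so a three-hop route is noticeably less reliable than any single step in it.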
