The UI is slick, although creating the puzzle remains on you.
Certainly suitable for string answers, XQuery/XPath/XSLT expressions, etc.
Build Your Own Text Editor by Jeremy Ruten.
From the webpage:
Welcome! This is an instruction booklet that shows you how to build a text editor in C.
The text editor is antirez’s kilo, with some changes. It’s about 1000 lines of C in a single file with no dependencies, and it implements all the basic features you expect in a minimal editor, as well as syntax highlighting and a search feature.
This booklet walks you through building the editor in 184 steps. Each step, you’ll add, change, or remove a few lines of code. Most steps, you’ll be able to observe the changes you made by compiling and running the program immediately afterwards.
I explain each step along the way, sometimes in a lot of detail. Feel free to skim or skip the prose, as the main point of this is that you are going to build a text editor from scratch! Anything you learn along the way is bonus, and there’s plenty to learn just from typing in the changes to the code and observing the results.
See the appendices for more information on the tutorial itself (including what to do if you get stuck, and where to get help).
If you’re ready to begin, then go to chapter 1!
… (emphasis in original)
I mention this tutorial because:
Of the three, the “make changes, see the results” approach is probably the most important.
Examples that “just work” are great and I look for them all the time. 😉
But imagine examples that take you down the false leads and traps, allowing you to observe the cryptic error messages from XQuery for example. You do work your way to a solution but are not given one out of the box.
“Cryptic” is probably overly generous with regard to XQuery error messages. Suggestions of a better one word term, usable in mixed company for them?
Mindstorms by Seymour Papert.
From the webpage:
Seymour Papert’s Mindstorms was published by Basic Books in 1980, and outlines his vision of children using computers as instruments for learning. A second edition, with new Forewords by John Sculley and Carol Sperry, was published in 1993. The book remains as relevant now as when first published almost forty years ago.
The Media Lab is grateful to Seymour Papert’s family for allowing us to post the text here. We invite you to add your comments and reflections.
From the introduction:
…I believe that certain uses of very powerful computational technology and computational ideas can provide children with new possibilities for learning, thinking, and growing emotionally as well as cognitively….
Toyama gives numerous examples that dispel any naive faith in technology as a cure for social issues.
Given the near ubiquitous presence of computers in first world countries, how do you account for the lack of children with
…new possibilities for learning, thinking, and growing emotionally as well as cognitively….
If new learning or thinking has developed, it’s being very well hidden in national and international news reports.
Not for you but an interesting resource for introducing children to working with data.
The network template is a csv file with a header, two fields separated by commas.
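For readers who have not worked with such a file, here is a minimal sketch of reading a two-field CSV with a header into a simple network structure. The header names (source, target) and the sample rows are assumptions made up for illustration; the actual template may use different names.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: the header names and rows are assumptions,
# not taken from the actual network template.
SAMPLE = """source,target
Hamlet,Horatio
Hamlet,Ophelia
Ophelia,Laertes
"""


def read_edges(text):
    """Parse a two-column CSV with a header row into a list of (a, b) pairs."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [(row[0].strip(), row[1].strip()) for row in reader if len(row) == 2]


def adjacency(edges):
    """Build an undirected adjacency map from the edge pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


edges = read_edges(SAMPLE)
graph = adjacency(edges)
for node in sorted(graph):
    print(node, "->", ", ".join(sorted(graph[node])))
```

Each row is one connection between two names, which is about as simple as network data gets.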
Pick the right text/examples and you could have a class captivated pretty quickly.
From the webpage:
Start doing data science in minutes
As a data scientist, you don’t want to waste your time installing software. Our goal is to provide a virtual environment that will enable you to start doing data science in a matter of minutes.
As a teacher, author, or organization, making sure that your students, readers, or members have the same software installed is not straightforward. This open source project will enable you to easily create custom software and data bundles for the Data Science Toolbox.
A virtual environment for data science
The Data Science Toolbox is a virtual environment based on Ubuntu Linux that is specifically suited for doing data science. Its purpose is to get you started in a matter of minutes. You can run the Data Science Toolbox either locally (using VirtualBox and Vagrant) or in the cloud (using Amazon Web Services).
We aim to offer a virtual environment that contains the software that is most commonly used for data science while keeping it as lean as possible. After a fresh install, the Data Science Toolbox contains the following software:
- Python, with the following packages: IPython Notebook, NumPy, SciPy, matplotlib, pandas, scikit-learn, and SymPy.
- R, with the following packages: ggplot2, plyr, dplyr, lubridate, zoo, forecast, and sqldf.
- dst, a command-line tool for installing additional bundles on the Data Science Toolbox (see next section).
Let us know if you want to see something added to the Data Science Toolbox.
Great resource for doing or teaching data science!
And an example of using a VM to distribute software in a learning environment.
Ignorance of Apache OpenMeetings is the only explanation I can offer for non-Apache OpenMeetings webinars with one presenter, listeners and a chat channel.
Proprietary solutions limit your audience’s choice of platforms, while offering no, repeat no advantages over Apache OpenMeetings.
It may be that your IT department is too busy creating SQLi weaknesses to install and configure Apache OpenMeetings, but even so that’s a fairly poor excuse for not using it.
If you just have to spend money to “trust” software, there are commercial services that offer hosting and other services for Apache OpenMeetings.
Apologies, sort of, for the Wednesday rant, but I tire of limited but “popular logo” commercial services used in place of robust open source solutions.
From the post:
A century of research shows that traditional grammar lessons—those hours spent diagramming sentences and memorizing parts of speech—don’t help and may even hinder students’ efforts to become better writers. Yes, they need to learn grammar, but the old-fashioned way does not work.
This finding—confirmed in 1984, 2007, and 2012 through reviews of over 250 studies—is consistent among students of all ages, from elementary school through college. For example, one well-regarded study followed three groups of students from 9th to 11th grade where one group had traditional rule-bound lessons, a second received an alternative approach to grammar instruction, and a third received no grammar lessons at all, just more literature and creative writing. The result: No significant differences among the three groups—except that both grammar groups emerged with a strong antipathy to English.
There is a real cost to ignoring such findings. In my work with adults who dropped out of school before earning a college degree, I have found over and over again that they over-edit themselves from the moment they sit down to write. They report thoughts like “Is this right? Is that right?” and “Oh my god, if I write a contraction, I’m going to flunk.” Focused on being correct, they never give themselves a chance to explore their ideas or ways of expressing those ideas. Significantly, this sometimes-debilitating focus on “the rules” can be found in students who attended elite private institutions as well as those from resource-strapped public schools.
(Three out of five links here are pay-per-view. Sorry.)
It’s only a century of research. Don’t want to rush into anything. 😉
How would you adapt this finding to teaching programming and/or hacking?
From the post:
Racket is a programming language in the Lisp tradition that is different from other programming languages in a few important ways. It can be any language you want – because Racket is heavily used for pedagogy, it has evolved into a suite of languages and tools that you can use to explore as many different programming paradigms as you can think of. You can also download it and play with it right now, without installing anything else, or knowing anything at all about computers or programming. Watching Matthias Felleisen’s “big-bang: the world, universe, and network in the programming language” talk will give you an idea of how Racket can be used to help people learn how to think about mathematics, computation, and more. Try it out even if you “hate Lisp” or “don’t know how to program” – it’s really a lot of fun.
Aaron and Michael scooped President Obama’s computer science skills for everyone by a day:
President Barack Obama said Saturday he will ask Congress for billions of dollars to help students learn computer science skills and prepare for jobs in a changing economy.
“In the new economy, computer science isn’t an optional skill. It’s a basic skill, right along with the three R’s,” Obama said in his weekly radio and Internet address….(Obama Wants $4B to Help Students Learn Computer Science)
“Computer science for everyone” is a popular chant, but consider the Insecure Internet of Things (IIoT).
Will minimal computer science skills increase or decrease the level of security for the IIoT?
That’s what I think too.
Removal of IoT components is the only real defense. Expect a vibrant cottage industry to grow up around removing IoT components.
From the post:
If you know regular expressions, you might find this to be geek fun. A friend of mine posted this, without a solution, but once I started working it, it seemed put together well enough it was likely solvable. Eventually I did solve it, but not before coding up a web interface for verifying my solution and rotating the puzzle in the browser, which I recommend using if you are going to try this out. Or just print it out.
It’s actually quite impressive of a puzzle in its own right. It must have taken a lot of work to create.
The image is a link to the interactive version with the rules.
Other regex crossword puzzle resources:
RegHex – An alternative web interface to help solve the MIT hexagonal regular expression puzzle.
Regex Crossword – Starting with a tutorial, the site offers 9 levels/types of games, concluding with five (5) hexagonal ones (only a few blocks on the first one and increasingly complex).
In case you need help with some of the regex puzzles, you can try: Awesome Regex – A collection of regex resources.
This paper discusses an approach to representing and reasoning about constraints over strings. We discuss how many string domains can often be concisely represented using regular languages, and how constraints over strings, and domain operations on sets of strings, can be carried out using this representation.
Each regex clue you add is a constraint on all the intersecting cells. Your first regex clue is unbounded, but every clue after that has a constraint. Wait, that’s not right! Constraints arise only when cells governed by different regexes intersect.
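To make the intersection point concrete, here is a rough brute-force sketch for a small 2x2 regex crossword. It is not how RegHex or Regex Crossword work internally; the grid size, the clues and the function name solve are chosen only for illustration. Each row clue is first checked alone, then the column clues are applied to the cells where rows and columns cross.

```python
import re
from itertools import product
from string import ascii_uppercase

# Illustrative 2x2 puzzle: one regex per row and one per column.
ROWS = [r"HE|LL|O+", r"[PLEASE]+"]
COLS = [r"[^SPEAK]+", r"EP|IP|EF"]


def solve(rows, cols, alphabet=ascii_uppercase):
    """Return (candidates per row taken alone, grids satisfying every clue)."""
    n_cols = len(cols)
    row_res = [re.compile(p) for p in rows]
    col_res = [re.compile(p) for p in cols]

    # Step 1: each row clue considered in isolation.
    row_options = []
    for rx in row_res:
        words = ("".join(cells) for cells in product(alphabet, repeat=n_cols))
        row_options.append([w for w in words if rx.fullmatch(w)])

    # Step 2: column clues only bite where they cross the row candidates.
    solutions = []
    for grid in product(*row_options):
        columns = ["".join(row[c] for row in grid) for c in range(n_cols)]
        if all(rx.fullmatch(col) for rx, col in zip(col_res, columns)):
            solutions.append(list(grid))
    return row_options, solutions


row_options, solutions = solve(ROWS, COLS)
print([len(options) for options in row_options])  # [3, 25]
print(solutions)  # [['HE', 'LP']]
```

Taken alone, the two row clues allow 3 and 25 fillings. Only once the column clues cross them does the grid collapse to a single answer.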
Anyone interested in going beyond hexagons and/or 2 dimensions?
I first saw this in a tweet by Alexis Lloyd.
Data Science Learning Club by Renee Teate.
From the Hello and welcome message:
I’m Renee Teate, the host of the Becoming a Data Scientist Podcast, and I started this club so data science learners can work on projects together. Please browse the activities and see what we’re up to!
What is the Data Science Learning Club?
This learning club was created as part of the Becoming a Data Scientist Podcast [coming soon!]. Each episode, there is a “learning activity” announced. Anyone can come here to the club forum to get details and resources, participate in the activity, and share their results.
Participants can use any technology and any programming language to do the activities, though I expect most will use python or R. No one is “teaching” how to do the activity, we’ll just share resources and all do the activity during the same time period so we can help each other out if needed.
How do I participate?
Just register for a free account, and start learning!
If you’re joining in a “live” activity during the 2 weeks after a podcast episode airs (the original “assignment” period listed in the forum description), then you can expect others to be doing the activity at the same time and helping each other out. If you’re working through the activities from the beginning after the original assignment period is over, you can browse the existing posts for help and you can still post your results. If you have trouble, feel free to post a question, but you may not get a timely response if the activity isn’t the current one.
- If you are brand new to data science, you may want to start at activity 00 and work your way through each activity with the help of the information in posts by people that did it before you. I plan to make them increase in difficulty as we go along, and they may build on one another. You may be able to skip some activities without missing out on much, and also if you finish more than 1 activity every 2 weeks, you will be going faster than new activities are posted and will catch up.
- If you know enough to have done most of the prior activities on your own, you don’t have to start from the beginning. Join the current activity (latest one posted) with the “live” group and participate in the activity along with us.
- If you are more advanced, please join in anyway! You can work through activities for practice and help out anyone that is struggling. Show off what you can do and write tutorials to share!
If you have challenges during the activity and overcome them on your own, please post about it and share what you did in case others come across the same challenges. Once you have success, please post about your experience and share your good results! If you write a post or tutorial on your own blog, write a brief summary and post a link to it, and I’ll check it out and promote the most helpful ones.
The only “dues” for being a member of the club are to participate in as many activities as possible, share as much of your work as you can, give constructive feedback to others, and help each other out as needed!
I look forward to this series of learning activities, and I’ll be participating along with you!
Renee’s Data Science Learning Club is due to go live on December 14, 2015!
With the various free courses, Stack Overflow and similar resources, it will be interesting to see how this develops.
Hopefully recurrent questions will develop into tutorials culled from discussions. That hasn’t happened with Stack Overflow, not that I am aware of, but perhaps it will happen here.
Stop by and see how the site develops!
From the post:
Stanford math education professor Jo Boaler spends a lot of time worrying about how math education in the United States traumatizes kids. Recently, a colleague’s 7-year-old came home from school and announced he didn’t like math anymore. His mom asked why and he said, “math is too much answering and not enough learning.”
This story demonstrates how clearly kids understand that unlike their other courses, math is a performative subject, where their job is to come up with answers quickly. Boaler says that if this approach doesn’t change, the U.S. will always have weak math education.
“There’s a widespread myth that some people are math people and some people are not,” Boaler told a group of parents and educators gathered at the 2015 Innovative Learning Conference. “But it turns out there’s no such thing as a math brain.” Unfortunately, many parents, teachers and students believe this myth and it holds them up every day in their math learning.
Intriguing article that suggests the solution to the lack of students in computer science and mathematics may well be to work on changing the attitudes of students…about themselves as computer science or mathematics students.
Something to remember when users are having a hard time grasping your explanation of semantics and/or topic maps.
Oh, another high point in the article, our brains physically swell and shrink:
Neuroscientists now know that the brain has the ability to grow and shrink. This was demonstrated in a study of taxi drivers in London who must memorize all the streets and landmarks in downtown London to earn a license. On average it takes people 12 tries to pass the test. Researchers found that the hippocampus of drivers studying for the test grew tremendously. But when those drivers retired, the brain shrank. Before this, no one knew the brain could grow and shrink like that.
It is only year two of the Human Brain Project, and we now know that one neuron can have thousands of synapses and that the infrastructure of the brain grows and shrinks. None of that information was available when the project started.
How do you succeed when the basic structure to be modeled keeps changing?
Perhaps that is why the Human Brain Project has no defined measure of “success”, other than spending all the allotted funds over a ten year period. That I am sure they will accomplish.
From the post:
Summary: Would you like to optimize your learning of Clojure? Would you like to focus on learning only the most useful parts of the language first? Take this lesson from second language learning: learn the expressions in order of frequency of use.
When I was learning Spanish, I liked to use Anki to drill new vocabulary. It’s a flashcard program. I found that someone had made a set of cards from an analysis of thousands of newspapers. They read in all of the words from the newspapers, counted them up, and figured out what the most common words were. The top 1000 made it into the deck.
It turns out that this is a very good strategy for learning words. Word frequency follows a hockey stick distribution. The most common words are used so much more than the less common words. For instance, the 100 most common English words make up more than 50% of text. If you’ve got limited time, you should learn those most common words first.
People who are trying to learn Clojure have been asking me “how do I learn all of this stuff? There’s so much!” It’s a valid question and I haven’t had a good answer. I remembered the Spanish newspaper analysis and I thought I’d try to do a similar analysis of Clojure expressions.
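The frequency idea is easy to try for yourself. What follows is a minimal sketch, not Eric’s actual analysis: it counts tokens in a made-up sample sentence and reports what share of the text the top N tokens cover. The function names tokenize and coverage are my own, and the same counting could be pointed at symbols pulled from source code instead of words.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def coverage(tokens, top_n):
    """Fraction of all tokens accounted for by the top_n most frequent ones."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(tokens)


# Made-up sample text, only for illustration.
SAMPLE = "the cat sat on the mat and the dog sat on the log"
tokens = tokenize(SAMPLE)

print(Counter(tokens).most_common(3))
for n in (1, 3):
    print(f"top {n}: {coverage(tokens, n):.0%} of the text")
```

Even in a thirteen-word sample the top three tokens cover well over half the text, which is the hockey stick in miniature.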
Is Eric seriously suggesting using lessons learned in another field? 😉
Of course, at a CS conference, a presentation on using the 100 most common Clojure expressions would have a title similar to:
Use of High Frequency Terminology Repetition: A Small Group Study (maybe 12 participants)
You could, of course, skip waiting for a conference presentation with a title like that one, followed by peer reviewed paper(s), more conference presentations and its final appearance in a collection of potential ways to improve CS instruction.
Let me know if Eric’s suggestion works for you.
PS: Thanks Eric!
From the post:
I was first introduced to the idea of problem-solution ordering issues by Richard Lemarchand, one of my game design professors. The idea stuck with me, mostly because it provided a satisfying explanation for a certain confusing pattern of player behavior that I’d witnessed many times in the past.
Here’s the pattern. A new player jumps into your game and starts bouncing around your carefully crafted tutorial level. The level funnels them to the key, which they collect, and then on to the corresponding locked door, which they successfully open. Then, somewhere down the road, they encounter a second locked door… and are completely stumped. They’ve solved this problem once before – why are they having such a hard time solving it again?
What we have here is a problem-solution ordering issue. Because the player got the key in the first level before encountering the locked door, they never really formed an understanding of the causal link between “get key” and “open door”. They got the key, and then some other stuff happened, and then they reached the door, and were able to open it; but “acquiring the key” and “opening the door” were stored as two separate, disconnected events in the player’s mind.
If the player had encountered the locked door first, tried to open it, been unable to, and then found the key and used it to open the door, the causal link would be unmistakable. You use the key to open the locked door, because you can’t open the locked door without the key.
This problem becomes a lot more obvious when you don’t call the key a key, or when the door doesn’t look like a locked door. The “key/door” metaphor is widely understood and frequently used in video games, so many players will assume that you use a key to open a locked door even if your own game doesn’t do a great job of teaching them this fact. But if the “key” is really a thermal detonator and the “door” is really a power generator, a lot of players are going to wind up trying to destroy the second generator they encounter by whacking it ineffectually with a sword.
Max goes on to apply problem-solution ordering to teaching both math and monads.
I don’t recall seeing or writing any topic map materials that started with concrete problems that would be of interest to the average user.
Make no mistake, there were always lots of references to where semantic confusion was problematic but that isn’t the same as starting with problems a user is likely to encounter.
The examples and literature Max points to make me interested in starting with concrete problems topic maps are good at solving and then introducing topic map concepts as necessary.
From the post:
There are lots of puzzle programming tutorials currently in fashion: Code.org, Gidget and Parson’s programming puzzles. But, we don’t really know if they work? There is work that shows that completion exercises do work well, but what about puzzles? That is what Kyle wants to find out.
Felienne is live blogging presentations from VL/HCC 2015, the IEEE Symposium on Visual Languages and Human-Centric Computing.
The post is a quick read and should generate interest in both programming completion puzzles and similar puzzles for authoring topic maps.
There is a pre-print: Enabling Independent Learning of Programming Concepts through Programming Completion Puzzles.
Before you question the results based on the sample size, 27 students, realize that is 27 more test subjects than were used for a database project to replace all the outward services for 5K+ users. Fortunately, very fortunately, a group was able to convince management to tank the entire project. Quite a nightmare and a slur on “agile development.”
The lesson here is that puzzles are useful and some test subjects are better than no test subjects at all.
Suggestions for topic map puzzles?
From the webpage:
Get 1150 free online courses from the world’s leading universities — Stanford, Yale, MIT, Harvard, Berkeley, Oxford and more. You can download these audio & video courses (often from iTunes, YouTube, or university web sites) straight to your computer or mp3 player. Over 30,000 hours of free audio & video lectures, await you now.
An ever improving resource!
As of last January (2015), it listed 1100 courses.
Another fifty courses have been added and I discovered a course in Hittite!
The same problem of collating content across resources that I mentioned for data science books obtains here as you take courses in the same discipline or read primary/secondary literature.
What if I find references that are helpful in the Hittite course in the image PDFs of the Chicago Assyrian Dictionary? How do I combine that with the information from the Hittite course so if you take Hittite, you don’t have to duplicate my search?
That’s the ticket isn’t it? Not having different users performing the same task over and over again? One user finds the answer and for all other users, it is simply “there.”
Quite a different view of the world of information than the repetitive, non-productive, ad-laden and often irrelevant results from the typical search engine.
From the webpage:
The Law Library, Davis Library and the Sonja Haynes Stone Center have just purchased rich digital collections of NAACP, federal government and other organization documents. The collections illuminate the African American struggle to attain equal rights after Reconstruction. Collections span the 1870s to the 1980s. The collections are:
- Black Freedom Struggle in the 20th Century: Federal Government Records
- Black Freedom Struggle in the 20th Century: Organizational Records and Personal Papers
They supplement current UNC collections of NAACP documents and complement another new collection documenting earlier struggles, Slavery & the Law, and the existing Southern Life and African American History, 1715-1915, Plantation Records. Slavery and the Law features petitions on race, slavery, and free blacks that were submitted to state legislatures and county courthouses between 1775 and 1867.
The collections are in ProQuest’s History Vault Collection. For more information, contact a law librarian at 919-962-1194.
I rather doubt that the UNC Law Library has purchased these collections; more likely it has secured access to these materials for members of its faculty and student body. Hence the access via the ProQuest History Vault Collection.
Like any good massa, ProQuest is going to make a return on its investment, even if that excludes black Americans, indeed, all Americans, from learning the history of race in America from primary sources. Or at least those members of the population who don’t have institutional access to the ProQuest History Vault Collection.
What makes this particularly galling in this case is that the materials represent a history of struggling for freedom, a story that should be widely told. A story that is being suppressed as it were in the name of our current IP model in the United States.
If we are confined to the artifices of commercial exploitation currently in place, why doesn’t Congress, which has wasted $billions on aircraft that exhibit spontaneous combustion (long rumored about people but confirmed in the F-35), site license this resource for everyone in the United States?
That would eliminate the paperwork for every institution that wants to access this material, eliminate the paperwork for all those contracts for ProQuest, make the original sources of our racial history available to every person located in the United States, so where is the downside?
While we work on changing the pernicious and exploitative IP regime of the present day, let’s change the rules on site licensing and let the greed of ProQuest lead it into doing the right thing. I care nothing for their motives, so long as universal access is the result.
From the call:
Open Data is invaluable to support researchers, but we contend that open datasets used as Open Educational Resources (OER) can also be invaluable asset for teaching and learning. The use of real datasets can enable a series of opportunities for students to collaborate across disciplines, to apply quantitative and qualitative methods, to understand good practices in data retrieval, collection and analysis, to participate in research-based learning activities which develop independent research, teamwork, critical and citizenship skills. (For more detail please see: http://education.okfn.org/the-21st-centurys-raw-material-using-open-data-as-open-educational-resources)
We are inviting individuals and teams to submit case studies describing experiences in the use of open data as open educational resources. Proposals are open to everyone who would like to promote good practices in pedagogical uses of open data in an educational context. The selected case studies will be published in an open e-book (CC_BY_NC_SA) hosted by Open Knowledge Foundation Open Education Group http://education.okfn.org by mid September 2015.
Participation in the call requires the submission of a short proposal describing the case study (of around 500 words), all proposals must be written in English, however, the selected authors will have the opportunity to submit the case both in English and another language, as our aim is to support the adoption of good practices in the use of open data in different countries.
- Deadline for submission of proposals (approx. 500 words): 28th June
- Notification to accepted proposals: 5th of July
- Draft case study submitted for review (1500 – 2000 words): 26th of July
- Publication-ready deadline: 16th of August
- Publication date: September 2015
If you have any questions or comments please contact us by filling the “contact the editors” box at the end of this form
Use of open data implies a readiness to further the use of open data. One way to honor that implied obligation is to share with others your successes and just as importantly, any failures in the use of open data in an educational context.
All too often we hear only a steady stream of success stories and we wonder where others drew such perfect students, assistants, and clean data that underlie their success. Never realizing that their students, assistants and data are no better and no worse than ours. The regular mis-steps, false starts, outright wrong paths are omitted in the story telling. For time’s sake no doubt.
If you can, do participate in this effort, even if you only have a success story to relate. 😉
Treadstone 71 continues to act as an unpaid (so far as I know) advertising agent for Sharif University.
From the university homepage:
Sharif University of Technology is one of the largest engineering schools in the Islamic Republic of Iran. It was established in 1966 under the name of Aryamehr University of Technology and, at that time, there were 54 faculty members and a total of 412 students who were selected by national examination. In 1980, the university was renamed Sharif University of Technology. SUT now has a total of 300 full-time faculty members, approximately 430 part-time faculty members and a student body of about 12,000.
In part Treadstone 71 comments:
There are many documents available on honeypot detection. Not too many are found as a Master’s course at University levels. Sharif University as part of the Iranian institutionalized efforts to build a cyber warfare capability for the government in conjunction with AmnPardaz, Ashiyane, and shadowy groups such as Ajax and the Iranian Cyber Army is highly focused on such an endeavor. With funding coming from the IRGC, infiltration of classes and as members of academia with Basij members, Sharif University is the main driver of information security and cyber operations in Iran. Below is another of many such examples. Honeypots and how to detect them is available for your review.
It is difficult to find a Master’s degree in CS that doesn’t include coursework on network security in general and honeypots in particular. I spot checked some of the degrees offered by schools listed at: Best Online Master’s Degrees in Computer Science and found no shortage of information on honeypots.
I recognize the domestic (U.S.) political hysteria surrounding Iran but security decisions based on rumor and unfounded fears aren’t the best ones.
From the webpage:
Build a modern computer system, starting from first principles. The course consists of six weekly hands-on projects that take you from constructing elementary logic gates all the way to building a fully functioning general purpose computer. In the process, you will learn — in the most direct and intimate way — how computers work, and how they are designed.
This course is a fascinating 7-week voyage of discovery in which you will go all the way from Boolean algebra and elementary logic gates to building a central processing unit, a memory system, and a hardware platform, leading up to a general-purpose computer that can run any program that you fancy. In the process of building this computer you will become familiar with many important hardware abstractions, and you will implement them, hands on. But most of all, you will enjoy the tremendous thrill of building a complex and useful system from the ground up.
You will build all the hardware modules on your home computer, using a Hardware Description Language (HDL), learned in the course, and a hardware simulator, supplied by us. A hardware simulator is a software system that enables building and simulating gates and chips before actually committing them to silicon. This is exactly what hardware engineers do in practice: they build and test computers in simulation, using HDL and hardware simulators.
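To give a feel for the “start from elementary gates” idea, here is a small sketch in Python. It is not the course’s HDL or its simulator, only an illustration that Not, And, Or and Xor can all be composed from a single Nand primitive and then combined into a half adder. The function names are my own.

```python
def nand(a, b):
    """The only primitive gate: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def xor(a, b):
    # Standard four-Nand construction of exclusive or.
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def half_adder(a, b):
    """Add two bits, returning (sum, carry)."""
    return xor(a, b), and_(a, b)


# Print the truth table for the half adder.
for a in (0, 1):
    for b in (0, 1):
        total, carry = half_adder(a, b)
        print(f"{a} + {b} -> sum={total} carry={carry}")
```

Everything above the nand function is built without touching Python’s own boolean operators, which is the same discipline the course asks for, one layer at a time.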
Do you trust locks?
Do you know how locks work?
I don’t and yet I trust locks to work. But then a lock requires physical presence to be opened and locks do have a history of defeating attempts to unlock them without the key. Not always but a high percentage of the time.
Do you trust computers?
Do you know how computers work?
I don’t, not really. Not at the level of silicon.
So why would I trust computers? We know computers are as faithful as a napkin at a party and have no history of being secure, for anyone.
Necessity seems like a weak answer, doesn’t it? Trusting computers to be insecure seems like a better answer.
Not that everyone wants or needs to delve into computers at the level of silicon but exposure to the topic doesn’t hurt.
Might even help when you hear of hardware hacks like rowhammer. You don’t really think that is the last of the hardware hacks, do you? Seriously?
BTW, I first read about this course in the Clojure Gazette, which is a great read, whether you are a Clojure programmer or not. Take a look and consider subscribing. Another reason to subscribe is that it lists a snail-mail address in New Orleans, Louisiana.
Even the fast food places have good food in New Orleans. The non-fast food has to be experienced. Words are not enough. It would be like trying to describe sex to someone who has only read about it. Just not the same. Every conference should be in New Orleans every two or three years.
After you get through day-dreaming about New Orleans, go ahead and register for From Nand to Tetris / Part I (April 11 – June 7, 2015).
From the post:
In the words of Alex Szalay, these sorts of researchers must be “Pi-shaped” as opposed to the more traditional “T-shaped” researcher. In Szalay’s view, a classic PhD program generates T-shaped researchers: scientists with wide-but-shallow general knowledge, but deep skill and expertise in one particular area. The new breed of scientific researchers, the data scientists, must be Pi-shaped: that is, they maintain the same wide breadth, but push deeper both in their own subject area and in the statistical or computational methods that help drive modern research:
Perhaps neither of these labels or descriptions is quite right. Another school of thought on data science is Jim Gray’s idea of the “Fourth Paradigm” of scientific discovery: First came the observational insights of empirical science; second were the mathematically-driven insights of theoretical science; third were the simulation-driven insights of computational science. The fourth paradigm involves primarily data-driven insights of modern scientific research. Perhaps just as the scientific method morphed and grew through each of the previous paradigmatic transitions, so should the scientific method across all disciplines be modified again for this new data-driven realm of knowledge.
Neither of the labels in the graphic is correct, in part because this is a classic light versus dark dualism, along the lines of Middle Age scholars making reference to the dark ages. You could not have asked anyone living between the 6th and 13th centuries what it felt like to live in the “dark ages.” That name was invented later to distinguish the “dark ages,” an invention that came about in the “Middle Ages.” The term “Middle Ages” was coined, of course, during the Renaissance.
Every age thinks it is superior to those that came before and the same is true for changes in the humanities and sciences. Fear not, someday your descendants will wonder how we fed ourselves, being hobbled with such vastly inferior software and hardware.
I mention this because the “Pi-shaped” graphic is making the rounds on Twitter. It is only one of any number of new “distinctions” that are springing up in academia and elsewhere. None of which will be of interest or perhaps even intelligible in another twenty years.
Rather than focusing on creating ephemeral labels for ourselves and others, how about we focus on research and results, whatever label has been attached to someone? Yes?
Clare recites all the numbing stats on the coming shortage of data scientists but then takes a turn that most don’t.
Clare outlines a master’s in data science curriculum using, for the most part, free resources on the Web.
Will you help reduce the coming shortage of data scientists?
From the post:
Explains the basic concepts of Category Theory, useful terminology to help understand the literature, and why it’s so relevant to software engineering.
Some two hundred and nine (209) slides, ending with pointers to other resources.
I would have dearly loved to see the presentation live!
This slide deck comes as close as any I have seen to teaching category theory as you would a natural language. Not too close but closer than others.
Think about it. When you entered school did the teacher begin with the terminology of grammar and how rules of grammar fit together?
Or, did the teacher start you off with “See Jack run.” or its equivalent in your language?
You were well on your way to being a competent language user before you were tasked with learning the rules for that language.
Interesting that the exact opposite approach is taken with category theory and so many topics related to computer science.
Pointers to anyone using a natural language teaching approach for category theory or CS material?
The Revolution in Astronomy Education: Data Science for the Masses
by Kirk D. Borne, et al.
As our capacity to study ever-expanding domains of our science has increased (including the time domain, non-electromagnetic phenomena, magnetized plasmas, and numerous sky surveys in multiple wavebands with broad spatial coverage and unprecedented depths), so have the horizons of our understanding of the Universe been similarly expanding. This expansion is coupled to the exponential data deluge from multiple sky surveys, which have grown from gigabytes into terabytes during the past decade, and will grow from terabytes into Petabytes (even hundreds of Petabytes) in the next decade. With this increased vastness of information, there is a growing gap between our awareness of that information and our understanding of it. Training the next generation in the fine art of deriving intelligent understanding from data is needed for the success of sciences, communities, projects, agencies, businesses, and economies. This is true for both specialists (scientists) and non-specialists (everyone else: the public, educators and students, workforce). Specialists must learn and apply new data science research techniques in order to advance our understanding of the Universe. Non-specialists require information literacy skills as productive members of the 21st century workforce, integrating foundational skills for lifelong learning in a world increasingly dominated by data. We address the impact of the emerging discipline of data science on astronomy education within two contexts: formal education and lifelong learners.
Kirk Borne posted a tweet today about this paper with the following graphic:
I deeply admire the work that Kirk has done, is doing and hopefully will continue to do, but is the answer really that simple? That is, do we just need to provide people with “…great tools written by data scientists?”
As an example of what drives my uncertainty, a number of years ago I saw a presentation in biblical studies that involved statistical analysis. When the speaker was asked why a particular result was significant, the response was that the manual said it was. Ouch!
On the other hand, it may be that like automobiles, we have to accept a certain level of accidents/injuries/deaths as a cost of making such tools widely available.
Should we acknowledge up front that a certain level of misuse, poor use, and inappropriate use of “great tools written by data scientists” is a cost of making data and data tools available?
PS: I am leaving to one side cases where tools have been deliberately fashioned to reach false or incorrect results. Detecting those cases might challenge seasoned data scientists.
From the post:
Singularity University (SU), the technology-focused education institute and global business accelerator has announced a new multi-million dollar agreement with Google aimed at breaking down barriers to technology innovation by creating opportunities for a more diverse group of entrepreneurs from around the world.
Through the agreement, Google will provide $1.5 million annually for the next two years to help fund qualified and selected candidates to SU’s flagship Graduate Studies Program (GSP) – a 10-week immersive experience that educates and empowers the best minds to use exponential technologies to solve the world’s greatest challenges. While SU’s sponsored Global Impact Competitions (GIC) winners will continue to comprise a substantial portion of the GSP class, the new Google funding will enable SU to also make the remaining seats in the program available free of charge to direct applicants. GSP participants are engaged in twelve tracks of exponential technology development and mentored by leaders and investors in the technology sector with the focus of abating poverty and creating innovative solutions in the areas of clean energy, water, education, security, and healthcare.
Applications are now open for the 2015 Graduate Studies Program through SU’s Direct Admission online application: http://apply2015.singularityu.org/
Recently MapR made Hadoop training and certification available for free, and now Google is funding Singularity University so that seats in its Graduate Studies Program are free as well.
A marked contrast to state-supported colleges and universities, where tuition continues to rise faster than inflation. Not to mention that educational loans, which are made at no risk to lenders, continue to burden students for years after graduation.
What does the “free market” know about the return on education that the “public sector” seems to have forgotten?
Rather than investing $trillions in the pursuit of terrorist bogeymen, paying off all student debt and making higher education free for everyone would be a much better investment.
MapR Offers Free Hadoop Training and Certifications by Thor Olavsrud.
From the post:
In an effort to make Hadoop training for developers, analysts and administrators more accessible, Hadoop distribution specialist MapR Technologies Tuesday unveiled a free on-demand training program. Another track for HBase developers will be added later this quarter.
“This represents a $50 million, in-kind contribution to the Hadoop community,” says Jack Norris, CMO of MapR. “The focus is overcoming what many people consider the major obstacle to the adoption of big data, particularly Hadoop.”
The developer track is about building big data applications in Hadoop. The topics range from the basics of Hadoop and related technologies to advanced topics like designing and developing MapReduce and HBase applications with hands-on labs. The courses include:
- Hadoop Essentials. This course, which is immediately available, provides an introduction to Hadoop, the ecosystem, common solutions and use cases.
- Developing Hadoop Applications. This course is also immediately available and focuses on designing and writing effective Hadoop applications with MapReduce and YARN.
- HBase Schema Design and Modeling. This course will become available in February and will focus on architecture, schema design and data modeling on HBase.
- Developing HBase Applications. This course will also debut in February and focuses on real-world application design in HBase (Time Series and Social Application examples).
- Hadoop Data Analysis – Drill. Slated for debut in March, this course covers interactive SQL on Hadoop for structured, semi-structured and nested data.
I remember how expensive the Novell training classes were back in the NetWare 4.11 days. (Yes, that has been a while.)
I wonder whose software will come to mind after completing the MapR training courses and passing the certification exams?
That’s what I think too. Send kudos to MapR for this effort!
Looking forward to seeing some of you at Hadoop certification exams later this year!
I first saw this in a tweet by Kirk Borne.
From the course page:
Machine learning is the science of getting computers to act without being explicitly programmed. In the past decade, machine learning has given us self-driving cars, practical speech recognition, effective web search, and a vastly improved understanding of the human genome. Machine learning is so pervasive today that you probably use it dozens of times a day without knowing it. Many researchers also think it is the best way to make progress towards human-level AI. In this class, you will learn about the most effective machine learning techniques, and gain practice implementing them and getting them to work for yourself. More importantly, you’ll learn about not only the theoretical underpinnings of learning, but also gain the practical know-how needed to quickly and powerfully apply these techniques to new problems. Finally, you’ll learn about some of Silicon Valley’s best practices in innovation as it pertains to machine learning and AI.
This course provides a broad introduction to machine learning, datamining, and statistical pattern recognition. Topics include: (i) Supervised learning (parametric/non-parametric algorithms, support vector machines, kernels, neural networks). (ii) Unsupervised learning (clustering, dimensionality reduction, recommender systems, deep learning). (iii) Best practices in machine learning (bias/variance theory; innovation process in machine learning and AI). The course will also draw from numerous case studies and applications, so that you’ll also learn how to apply learning algorithms to building smart robots (perception, control), text understanding (web search, anti-spam), computer vision, medical informatics, audio, database mining, and other areas.
I could have just posted Machine Learning, Andrew Ng and 19 Jan., but there are people who have not heard of this course before. Hard to believe, but I have been assured that is in fact the case.
So the prose stuff is for them. Why are you reading this far? Go register for the course!
I have heard rumors the first course had an enrollment of over 100,000! I wonder if this course will break current records?
From the post:
While you were eating turkey, we were busy rummaging around the internet and adding new courses to our big list of Free Online Courses, which now features 1,100 courses from top universities. Let’s give you the quick overview: The list lets you download audio & video lectures from schools like Stanford, Yale, MIT, Oxford and Harvard. Generally, the courses can be accessed via YouTube, iTunes or university web sites, and you can listen to the lectures anytime, anywhere, on your computer or smart phone. We didn’t do a precise calculation, but there’s probably about 33,000 hours of free audio & video lectures here. Enough to keep you busy for a very long time.
Right now you’ll find 127 free philosophy courses, 82 free history courses, 116 free computer science courses, 64 free physics courses and 55 Free Literature Courses in the collection, and that’s just beginning to scratch the surface. You can peruse sections covering Astronomy, Biology, Business, Chemistry, Economics, Engineering, Math, Political Science, Psychology and Religion.
OpenCulture has gathered up a large variety of materials.
In the meantime, there are a number of other course selections to enjoy!
From the homepage:
The Georgia Institute of Technology, Udacity and AT&T have teamed up to offer the first accredited Master of Science in Computer Science that students can earn exclusively through the Massive Open Online Course (MOOC) delivery format and for a fraction of the cost of traditional, on-campus programs.
This collaboration—informally dubbed “OMS CS” to account for the new delivery method—brings together leaders in education, MOOCs and industry to apply the disruptive power of massively open online teaching to widen the pipeline of high-quality, educated talent needed in computer science fields.
Whether you are a current or prospective computing student, a working professional or simply someone who wants to learn more about the revolutionary program, we encourage you to explore the Georgia Tech OMS CS: the best computing education in the world, now available to the world.
A little more than a year old, the Georgia Tech OMS CS program continues to grow. In One Down, Many to Go, Carl Straumsheim writes of high marks for the program from students, and of administrators feeling their way along in this exercise in the delivery of education.
At an estimated cost of less than $7,000 for a Master of Science in Computer Science, this program has the potential to change the complexion of higher education, in computer science at least.
How many years (decades?) it will take for this delivery model to trickle down to the humanities is uncertain. J.J. O’Donnell made waves in 2004 by teaching Augustine: the Seminar to a global audience, but there has been no rush of humanities scholars to follow his example.
From the about page:
XPERT (Xerte Public E-learning ReposiTory) project is a JISC funded rapid innovation project (summer 2009) to explore the potential of delivering and supporting a distributed repository of e-learning resources created and seamlessly published through the open source e-learning development tool called Xerte Online Toolkits. The aim of XPERT is to progress the vision of a distributed architecture of e-learning resources for sharing and re-use.
Learners and educators can use XPERT to search a growing database of open learning resources suitable for students at all levels of study in a wide range of different subjects.
Creators of learning resources can also contribute to XPERT via RSS feeds created seamlessly through local installations of Xerte Online Toolkits. Xpert has been fully integrated into Xerte Online Toolkits, an open source content authoring tool from The University of Nottingham.
Other useful links:
The Google interface is “stark” in the same sense, but Google has indexed a substantial portion of all online content, so I’m not very likely to draw a blank. With Xpert, which has a base of 364,979 resources, the odds of my drawing a blank are far higher.
The keywords fall into three distinct alphabetical segments, one after the other: each starts with “a” or a digit and runs to its end, and then the next begins again with a digit or “a.” Hebrew and what appears to be Chinese appear at the end of the keyword list, in no particular order. I don’t know if that is an artifact of the software or of its use.
The same repeated alphabetical segments occur in Author. Under Type there are some true types such as “color print,” but the majority of the listing is file sizes in bytes. Not sure why file size would be a “type.” Institution has similar issues.
If you are looking for a volunteer opportunity, helping Xpert with alphabetization would enhance the browsing experience for the resources it has collected.
I first saw this in a tweet by Graham Steel.