Paper-reading log

Notes on interesting papers, to remind myself why I found them interesting, and to help find them again.

Primarily "blue-skies" research reading, not papers read in the course of writing one of my own, so not necessarily representative of my research.

2018-06-07 – The abstract organism: Towards a prehistory for a-life art

2018-05-30 – Measuring the intrinsic dimension of objective landscapes

2018-05-24 – A trans-Canada computer communications network

2018-05-08 – Vagueness, rationality and undecidability: A theory of why there is vagueness

2018-05-02 – Practical reinforcement learning in continuous spaces

2018-04-29 – Detection of a cosmic ray with measured energy well beyond the expected spectral cutoff due to cosmic microwave radiation

2018-04-28 – Six mathematical gems from the history of distance geometry

2018-04-10 – Theorizing affordances: From request to refuse

2018-04-05 – How to do research at the MIT AI Lab

2018-03-30 – Extremely randomized trees

2018-03-28 – Perspectives of approximate dynamic programming

2018-03-27 – Datafun: A functional Datalog

2018-03-26 – Finding alternative musical scales

2018-02-26 – Seagull: A bird's-eye view of the evolution of technical games research

2018-02-14 – Propp's morphology of the folk tale as a grammar for generation

2017-11-16 – Wordless games: Gameplay as narrative technique

2017-10-27 – On the notion of interestingness in automated mathematical discovery

2017-10-13 – Why most decisions are easy in Tetris—and perhaps in other sequential decision problems, as well

2017-09-12 – The ordinal nature of emotions

2017-08-26 – Grimes' fairy tales: A 1960s story generator

2017-08-25 – Some natural solutions to the p-value communication problem—and why they won't work

2017-07-27 – Twenty things to do with a computer

2017-07-20 – Computer of a thousand faces: Anthropomorphizations of the computer in design (1965-1975)

2017-05-30 – Propositional & analogical generation of coordinated verbal, visual & musical texts

2016-12-02 – Classification accuracy is not enough: On the evaluation of music genre recognition systems


2018-06-07
The abstract organism: Towards a prehistory for a-life art
M. Whitelaw. Leonardo, 2001.
Tags: alife, electronic art, art history

Abstract:

The author examines historical precedents for contemporary art practice using artificial life, in particular in the work of Paul Klee and Kasimir Malevich. Similarities are identified between artificial life and the philosophical tradition of organicism; specific examples from Klee and Malevich indicate that those artists were engaged in a form of creative organicist thought that imagined the realization of living structures in artificial media.

An interesting "prehistory". One can always find precursors to artistic movements, but this one seems like it might have some depth (only touched on in the brief article). By way of motivation, the paper argues that although a-life historically grew out of computing (in large part artificial intelligence) and views itself as future-oriented, this gives it a kind of ahistoricism that disconnects it more than warranted from ideas in the history of art. The specific point of connection is built around Langton's view that a-life is about the organization of matter in an artificial medium, which is then related to Klee and Malevich's interest in formal internal organization and dynamics. Klee is read as "exploring a kind of artificial physics for the picture plane — or more, an artificial cosmogeny", due to building a formal pictoral language abstracted from dynamics in nature. Evocative, but maybe a bit of a stretch? The connection to Malevich seems much more solid, reading Suprematism's interest in building a new "non-objective" reality from the internal dynamics of form as related to a-life's goals. Malevich's rhetoric around "technical organisms" definitely sounds a bit a-lifey, although it comes in his writings that are pretty sci-fi (the "Suprematist machine" will leave earth, "not by means of engines... but through the smooth harnessing of form").

2018-05-30
Measuring the intrinsic dimension of objective landscapes
C. Li, H. Farkhoor, R. Liu, J. Yosinski. Intl. Conf. Learning Representations, 2018.
Tags: machine learning

Abstract:

Many recently trained neural networks employ large numbers of parameters to achieve good performance. One may intuitively use the number of parameters required as a rough gauge of the difficulty of a problem. But how accurate are such notions? How many parameters are really needed? In this paper we attempt to answer this question by training networks not in their native parameter space, but instead in a smaller, randomly oriented subspace. We slowly increase the dimension of this subspace, note at which dimension solutions first appear, and define this to be the intrinsic dimension of the objective landscape. The approach is simple to implement, computationally tractable, and produces several suggestive conclusions. Many problems have smaller intrinsic dimensions than one might suspect, and the intrinsic dimension for a given dataset varies little across a family of models with vastly different sizes. This latter result has the profound implication that once a parameter space is large enough to solve a problem, extra parameters serve directly to increase the dimensionality of the solution manifold. Intrinsic dimension allows some quantitative comparison of problem difficulty across supervised, reinforcement, and other types of learning where we conclude, for example, that solving the inverted pendulum problem is 100 times easier than classifying digits from MNIST, and playing Atari Pong from pixels is about as hard as classifying CIFAR-10. In addition to providing new cartography of the objective landscapes wandered by parameterized models, the method is a simple technique for constructively obtaining an upper bound on the minimum description length of a solution. A byproduct of this construction is a simple approach for compressing networks, in some cases by more than 100 times.

An intriguing way to quantify the complexity of solutions, designed to avoid some of the problems of raw parameter counting, which has a long history in regularization but doesn't necessarily work well as a complexity metric for neural networks with enormous numbers of parameters. Although they sometimes refer to this as the "difficulty" of problems, it's very tied up with a specific learning algorithm, so it's really the difficulty of a particular problem *for* a particular neural-network architecture and training algorithm. They acknowledge this in the paper, though, and note that the learner-specificity has both pros and cons. The method is pretty simple, if a bit computationally intensive: restrict neural nets to learning in random subspaces of their usual parameter spaces, but otherwise use the same learning algorithm they would normally use. For example, starting with a 1-dimensional subspace is equivalent to doing line search in random directions. Increase the dimension of this subspace until the network can successfully learn a solution; the dimension at which solutions first appear is the intrinsic dimension of the problem (for this learner). As they note, there are many possible applications of such a measure.
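
Out of curiosity I sketched the core idea in a few lines of numpy. This is my own toy reconstruction (logistic regression standing in for the neural net, my variable names and hyperparameters), not the authors' code:

```python
# Toy sketch of training in a random subspace: optimize d coordinates z and map
# them into the D native parameters via a fixed random projection,
# theta = theta_0 + P @ z.
import numpy as np

rng = np.random.default_rng(0)

n, D = 500, 200                               # D = native parameter count
X = rng.normal(size=(n, D))
true_w = rng.normal(size=D)
y = (X @ true_w + 0.5 * rng.normal(size=n) > 0).astype(float)

def loss_and_grad(theta):
    """Logistic loss and its gradient w.r.t. the native parameters."""
    logits = np.clip(X @ theta, -30, 30)      # clip to avoid overflow warnings
    p = 1.0 / (1.0 + np.exp(-logits))
    loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    return loss, X.T @ (p - y) / n

def train_in_subspace(d, steps=2000, lr=0.5):
    theta_0 = np.zeros(D)                     # fixed starting point
    P = rng.normal(size=(D, d)) / np.sqrt(d)  # fixed random projection
    z = np.zeros(d)                           # the only trainable parameters
    for _ in range(steps):
        loss, g = loss_and_grad(theta_0 + P @ z)
        z -= lr * (P.T @ g)                   # chain rule: gradient w.r.t. z
    return loss

# Sweep d upward; the smallest d that roughly matches full-dimensional training
# is (approximately) the intrinsic dimension of this problem for this learner.
for d in [1, 2, 5, 10, 20, 50, 100, D]:
    print(d, round(train_in_subspace(d), 3))
```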

2018-05-24
A trans-Canada computer communications network
O.M. Solandt et al. Science Council of Canada Report, 1971.
Tags: networking, politics

Excerpt:

In the sector of service to users, there is one Major Project which would dramatically improve the quality and quantity of computer services available to all Canadians and would greatly assist Canadian industry. It is the organization of a nationwide system of computer communications networks. To achieve this goal a positive, vigorous Canadian policy is urgently needed. A "laissez-faire" attitude will eventually result in the supply of most computing and information services via spur lines from U.S. computer communications networks. Such an outcome is completely unacceptable on economic and social grounds. [...] The system of networks should not be allowed to practise "cream-skimming" by concentrating exclusively on the densely populated, highly profitable regions of Canada. It must link all important centres in Canada in order to bring computing and information services to the greatest possible number of Canadians.

An early report (via Daniel Joseph) worried about the impact on Canada of the then rapidly developing proto-internet networks, recommending a strong Canadian investment in a national network lest major Canadian cities simply become spokes on a US-controlled network with little scientific, political, or economic role in the network's development. Mostly interesting to me because 1971 seems like a fairly early date for a number of the topics discussed; ARPANET had only come online in 1969, but its implications were apparently already clear to Canadian scientists. Among other things: they worry that given Canada's geography, economic relationship to the U.S., and the status of ARPANET as a military/private/university partnership of exclusively American institutions, Canadian universities and companies will be cut out of a significant role in modern networking; that "the Canadian computer service industry is vulnerable to the dumping of surplus computer capacity from abroad"; that all data between Canadian cities would end up routed through the U.S.; and that future uses of the network for things such as hospital records and telemedicine (!) would be hampered without nationwide Canadian reach. Also some specific numbers and diagrams in support of these claims/recommendations.

2018-05-08
Vagueness, rationality and undecidability: A theory of why there is vagueness
M.A. Changizi. Synthese, 1999.
Tags: natural language, formal languages, logic

Excerpt:

Vagueness is not undecidability, but undecidability does enter into an explanation of why there is vagueness. My theory, called the Undecidability Theory of Vagueness, explains vagueness largely as a result of the fact that we are computationally bound. Vagueness is not due to any particularly human weakness, but due to a weakness that any computationally bound agent possesses; even HAL from 2001: A Space Odyssey will probably experience vagueness. Furthermore, I will argue that vagueness is good for you. I will do so by showing that if you were a computationally bound, rational alien agent given the task of figuring out what our natural language predicates mean, you would very probably end up with vagueness. That is, unless two highly plausible hypotheses are false, it would be rational for you—i.e., in your best interest—to choose concepts in such a way that they are vague. Given that you are computationally bound, avoiding vagueness brings in greater costs than accepting it.

This paper argues that vagueness in natural language can be explained purely as a property of formal systems. That's intriguing because vagueness is often taken as evidence against formal models being a good representation of language, given the seeming mismatch between the precise semantics of logic and the vagueness of real human languages. Alternatives are therefore proposed, such as: humans must use some more "subsymbolic" or "fuzzy" mode of computation, or perhaps the computational model of language/cognition is fundamentally wrong. Natural language and human cognition aren't my area, but the argument here is interesting regardless of whether it empirically explains human language or not. It argues that vagueness is a natural property of formal systems that ought to arise given three conditions: 1) the system is no more powerful than the Church–Turing thesis allows; 2) predicate meaning is extensionally determined by execution of an algorithm; and 3) such algorithms can be chosen from the space of all possible algorithms. The paper further argues that this situation is in a sense in an agent's "best interest": specifically, an agent meeting condition (1) can only avoid vagueness by giving up either (2) or (3), which would be a worse cure than the disease of vagueness. There are some intermediate concepts and technical results developed to get to this conclusion that I haven't yet read closely enough to follow.

2018-05-02
Practical reinforcement learning in continuous spaces
W.D. Smart, L.P. Kaelbling. Intl. Conf. Machine Learning, 2000.
Tags: machine learning, reinforcement learning

Abstract:

Dynamic control tasks are good candidates for the application of reinforcement learning techniques. However, many of these tasks inherently have continuous state or action variables. This can cause problems for traditional reinforcement learning algorithms which assume discrete states and actions. In this paper, we introduce an algorithm that safely approximates the value function for continuous state control tasks, and that learns quickly from a small amount of data. We give experimental results using this algorithm to learn policies for both a simulated task and also for a real robot, operating in an unaltered environment. The algorithm works well in a traditional learning setting, and demonstrates extremely good learning when bootstrapped with a small amount of human-provided data.

I was prompted to revisit earlier work on function approximators in RL by a recent blog post, "Deep Reinforcement Learning Doesn't Work Yet", which discusses instability of deep RL. Earlier work is much more suspicious of the soundness of sticking function approximators into RL algorithms in the first place. Section 2 of this paper has a nice summary of some of the reasons that "using value function approximation with Q-learning is not simply a case of substituting in a supervised learning algorithm for the Q-value lookup table". Specifically, the Q-values are iteratively estimated, so feedback effects cause errors to compound, unlike in supervised learning (this seems like it's probably part of the explanation for the extreme run-to-run instability seen in deep RL). The paper ties this to the view that function approximators can really only interpolate, not extrapolate, but may end up doing "hidden extrapolation" when put into more complex settings like approximating inside an RL loop. And, because of the recursive nature, post-hoc fixes like cross-validating to estimate error in different regions don't work either. Since we nonetheless need function approximators to do RL in large state spaces, the paper proposes an algorithm, Hedger, designed to address all that and allow "safe" Q-value approximation. It would be interesting to know whether any insights from the algorithm are relevant to more recent deep-RL style architectures.
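
As a reminder to myself of where the feedback loop sits, here's a minimal toy of Q-learning with a linear function approximator. This is just my own illustration of the structure, not the paper's Hedger algorithm:

```python
# Minimal chain-MDP Q-learning with a linear approximator. The key structural
# point: the regression target r + gamma * max_a' Q(s', a') is computed from
# the very approximator being fit, so approximation errors feed back into the
# targets (unlike supervised learning, where the targets are fixed).
import numpy as np

rng = np.random.default_rng(1)
gamma, alpha = 0.9, 0.1
n_states, n_actions = 5, 2     # 5-state chain; action 1 moves right, 0 left

def step(s, a):
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == n_states - 1 else 0.0)   # reward at the right end

def features(s, a):
    # One-hot features make this equivalent to a lookup table; the trouble the
    # paper worries about appears when features generalize across states.
    phi = np.zeros(n_states * n_actions)
    phi[s * n_actions + a] = 1.0
    return phi

w = np.zeros(n_states * n_actions)             # Q(s, a) = w . phi(s, a)
s = 0
for _ in range(5000):
    a = int(rng.integers(n_actions))           # random exploration
    s2, r = step(s, a)
    q_next = max(w @ features(s2, b) for b in range(n_actions))
    target = r + gamma * q_next                # target depends on w itself
    w += alpha * (target - w @ features(s, a)) * features(s, a)
    s = 0 if s2 == n_states - 1 else s2        # restart after reaching the goal

print(np.round(w.reshape(n_states, n_actions), 2))
```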

2018-04-29
Detection of a cosmic ray with measured energy well beyond the expected spectral cutoff due to cosmic microwave radiation
D.J. Bird et al. Astrophysical J., 1995.
Tags: physics

Abstract:

We report the detection of a 51-joule (320 +/- 90 EeV) cosmic ray by the Fly's Eye air shower detector in Utah. This is substantially greater than the energy of any previously reported cosmic ray. A Greisen-Zatsepin-Kuz'min cutoff of the energy spectrum (due to pion photoproduction energy losses) should occur below this energy unless the highest energy cosmic rays have traveled less than about 30 Mpc. The error box for the arrival direction in galactic coordinates is centered on b=9.6 deg, l=163.4 deg. The particle cascade reached a maximum size near a depth of 815 g/cm^2 in the atmosphere, a depth which does not uniquely identify the type of primary particle.

The writeup of 1991's "oh-my-god particle", an unusually high-energy cosmic ray detected by a facility in Utah. Obviously this is not my research area at all. I'd heard of it before, and someone linked the paper on Twitter, so I read it. It's surprisingly easy to follow, considering that my physics education consists of two undergrad courses and some Wikipedia reading. The gist of why this is interesting, besides just the unusually high energy, is that the cosmic microwave background won't allow extremely high energy particles to pass unattenuated. The threshold is known as the GZK limit, said in the paper to be somewhere below 100 EeV, and apparently now known to be around 50 EeV. The observed particle was around 320 EeV, so it must have originated nearby (barring the GZK limit being wrong); the paper estimates within 30 Mpc. That, plus an approximate origin direction, gives some idea of where to look for the object that produced the particle. The paper also includes a discussion of possible sources of error, alternative hypotheses, etc., since this kind of detector is not a completely straightforward device. That discussion might make for a good class-discussion paper in the right context: it is just nontrivial enough to be interesting, but I think it can be followed by an advanced undergraduate who has had an introduction to error propagation and data interpretation.

2018-04-28
Six mathematical gems from the history of distance geometry
L. Liberti, C. Lavor. Intl. Trans. Operational Research, 2016.
Tags: mathematics, distance geometry, history

Abstract:

This is a partial account of the fascinating history of distance geometry. We make no claim to completeness, but we do promise a dazzling display of beautiful, elementary mathematics. We prove Heron's formula, Cauchy's theorem on the rigidity of polyhedra, Cayley's generalization of Heron's formula to higher dimensions, Menger's characterization of abstract semimetric spaces, a result of Gödel on metric spaces on the sphere, and Schoenberg's equivalence of distance and positive semidefinite matrices, which is at the basis of multidimensional scaling.

Delivers what it promises: "a partial account of the fascinating history of distance geometry". A lively read full of historical anecdotes and asides (but also some proofs). I ran across this paper because I might have some use for distance geometry and was trying to get up to speed on some of its tools/results. It's not really an overview in that respect, but it does have readable summaries of a few major results. The aside in Figure 1 recounting an anecdote from when Kurt Gödel first presented his incompleteness theorem is also funny, although not about distance geometry ("That is very interesting. You should publish that," commented an audience member after a long pause).

2018-04-10
Theorizing affordances: From request to refuse
J.L. Davis, J.B. Chouinard. Bulletin of Science, Technology & Society, 2016.
Tags: human-computer interaction

Abstract:

As a concept, affordance is integral to scholarly analysis across multiple fields—including media studies, science and technology studies, communication studies, ecological psychology, and design studies among others. Critics, however, rightly point to the following shortcomings: definitional confusion, a false binary in which artifacts either afford or do not, and failure to account for diverse subject-artifact relations. Addressing these critiques, this article demarcates the mechanisms of affordance—as artifacts request, demand, allow, encourage, discourage, and refuse—which take shape through interrelated conditions: perception, dexterity, and cultural and institutional legitimacy. Together, the mechanisms and conditions constitute a dynamic and structurally situated model that addresses how artifacts afford, for whom and under what circumstances.

Briefly traces the concept of "affordance", which originated in ecological psychology, was popularized in design studies, and has become common in fields such as HCI and STS. Notes that the concept is sometimes attacked as being rather baggy, as it takes on a range of slippery meanings often left vague enough that commenters have wondered whether it clarifies anything. This paper proposes to sharpen the concept by further analyzing the general ability of things to "afford" into six specific mechanisms through which they do so, plus three contextual aspects accounting for how affordances depend on "for whom and under what circumstances" a thing is affording. The six mechanisms: a thing can request, demand, allow, encourage, discourage, and refuse. The three contextual aspects: perception, dexterity, and cultural/institutional legitimacy.

2018-04-05
How to do research at the MIT AI Lab
D. Chapman (ed.). MIT AI Lab Working Paper, 1988.
Tags: artificial intelligence, research

Abstract:

This document presumptuously purports to explain how to do research. We give heuristics that may be useful in picking up the specific skills needed for research (reading, writing, programming) and for understanding and enjoying the process itself (methodology, topic and advisor selection, and emotional factors).

A collectively written document from '80s MIT AI Lab grad students, including some now pretty well-known folks (Agre, Brooks, Forbus, Kaelbling, etc.). Some bits are MIT-specific or dated, e.g. how to get yourself plugged into the then-important network of hardcopy preprint circulation. But much of it is still relevant and worth a read. A good portion of the practical advice about how to give feedback, choose advisors, choose problems, etc. holds up pretty well, and some of it arguably holds up better than much of the more recent advice you'll find. The document has a generally refreshing tone promoting thoughtfulness, interdisciplinarity, and reading/studying widely. A nice antidote to narrower, less reflective views of AI research. Lots of great quotable sentences too.

2018-03-30
Extremely randomized trees
P. Geurts, D. Ernst, L. Wehenkel. Machine Learning, 2006.
Tags: machine learning

Abstract:

This paper proposes a new tree-based ensemble method for supervised classification and regression problems. It essentially consists of randomizing strongly both attribute and cut-point choice while splitting a tree node. In the extreme case, it builds totally randomized trees whose structures are independent of the output values of the learning sample. The strength of the randomization can be tuned to problem specifics by the appropriate choice of a parameter. We evaluate the robustness of the default choice of this parameter, and we also provide insight on how to adjust it in particular situations. Besides accuracy, the main strength of the resulting algorithm is computational efficiency. A bias/variance analysis of the Extra-Trees algorithm is also provided as well as a geometrical and a kernel characterization of the models induced.

The popular machine-learning method of random forests is built on the insight that classic decision trees have high variance, which can be reduced by introducing randomization plus averaging. L. Breiman's bagged decision trees, T. Ho's random subspace method, and Breiman's 2001 random forests are all aimed at this goal: randomize the construction of decision trees and then average the randomized trees, with the randomization/averaging process reducing variance. The core argument of this paper is: maybe more randomization? Specifically, that more randomization might be both better and faster. Random forests retain a key aspect of classical decision trees, choosing optimal split points on each candidate feature. That optimization is computationally expensive, and the cut-point ends up only indirectly randomized by the bootstrap resampling. Why not use the original data without bootstrapping, but explicitly randomize the cut-points? The Extra-Trees algorithm does this. One parameter, K, sets how many randomly selected features to consider at each split. A cut-point for each of the K features is drawn at random, and the best of the resulting K splits is kept. In the limit where K=1, this builds a "totally randomized tree" whose structure doesn't depend on the labels at all, putting it close to memory-based "lazy learner" methods such as kNN.
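
To fix the split rule in my head, here's a rough sketch of it. The code and naming are mine, and I've used Gini impurity as the split score, which isn't necessarily the paper's exact scoring:

```python
# Extra-Trees node splitting, roughly: pick K candidate features, draw one
# *random* cut-point per feature (uniform over that feature's range at the
# node), score the K resulting splits, keep the best. No bootstrap resampling,
# no cut-point optimization.
import numpy as np

rng = np.random.default_rng(0)

def gini(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def impurity_decrease(X, y, feature, cut):
    left = X[:, feature] < cut
    n, n_left = len(y), left.sum()
    if n_left == 0 or n_left == n:
        return -np.inf                        # degenerate split
    return gini(y) - (n_left / n) * gini(y[left]) - ((n - n_left) / n) * gini(y[~left])

def extra_trees_split(X, y, K):
    best = None
    for f in rng.choice(X.shape[1], size=K, replace=False):
        cut = rng.uniform(X[:, f].min(), X[:, f].max())   # random, not optimized
        score = impurity_decrease(X, y, f, cut)
        if best is None or score > best[2]:
            best = (int(f), float(cut), score)
    return best                               # (feature index, cut-point, score)

# Tiny demo: only feature 3 carries signal, so a split on it (if drawn) scores best.
X = rng.normal(size=(200, 10))
y = (X[:, 3] > 0).astype(int)
print(extra_trees_split(X, y, K=3))
```

If I ever want to actually try the full ensemble, scikit-learn ships an implementation as ExtraTreesClassifier / ExtraTreesRegressor.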

2018-03-28
Perspectives of approximate dynamic programming
W.B. Powell. Annals of Operations Research, 2016.
Tags: machine learning, reinforcement learning

Abstract:

Approximate dynamic programming has evolved, initially independently, within operations research, computer science and the engineering controls community, all searching for practical tools for solving sequential stochastic optimization problems. More so than other communities, operations research continued to develop the theory behind the basic model introduced by Bellman with discrete states and actions, even while authors as early as Bellman himself recognized its limits due to the "curse of dimensionality" inherent in discrete state spaces. In response to these limitations, subcommunities in computer science, control theory and operations research have developed a variety of methods for solving different classes of stochastic, dynamic optimization problems, creating the appearance of a jungle of competing approaches. In this article, we show that there is actually a common theme to these strategies, and underpinning the entire field remains the fundamental algorithmic strategies of value and policy iteration that were first introduced in the 1950's and 60's.

Attempts to draw connections and bring some order to disparate work on decision-making under uncertainty in several different fields. Whether Powell's proposal of "approximate dynamic programming" as *the* unifying framework for decision-making under uncertainty is a good one, I don't know. But I found the paper useful in trying to map out the work done in this area across operations research, control theory, and computer science. My educational background has been entirely from the computer-science perspective on this (i.e., reinforcement learning), so it's useful to get an overview of relevant stuff that's been done in those two other fields. The paper has a nice set of historical references too, especially for the line of research that most closely follows on from Bellman.

2018-03-27
Datafun: A functional Datalog
M. Arntzenius, N.R. Krishnaswami. Intl. Conf. Functional Programming, 2016.
Tags: programming languages, functional programming, logic programming

Abstract:

Datalog may be considered either an unusually powerful query language or a carefully limited logic programming language. Datalog is declarative, expressive, and optimizable, and has been applied successfully in a wide variety of problem domains. However, most use-cases require extending Datalog in an application-specific manner. In this paper we define Datafun, an analogue of Datalog supporting higher-order functional programming. The key idea is to track monotonicity with types.

Datafun seems interesting to try sometime. The parts of this paper that develop a bunch of PLs theory to define Datafun aren't really my field, but the paper also has a pretty good critical design overview of Datalog. Even though it's proposing a new language intended to improve on Datalog, the paper does that by first taking Datalog seriously on its own terms, and does a good job explaining what Datalog is trying to do, and what its strengths and weaknesses are in doing it. The first sentence of the abstract makes a nice pithy summary of Datalog too.

2018-03-26
Finding alternative musical scales
J.N. Hooker. Intl. Conf. Constraint Programming, 2016.
Tags: computer music, constraint solving

Abstract:

We search for alternative musical scales that share the main advantages of classical scales: pitch frequencies that bear simple ratios to each other, and multiple keys based on an underlying chromatic scale with tempered tuning. We conduct the search by formulating a constraint satisfaction problem that is well suited for solution by constraint programming. We find that certain 11-note scales on a 19-note chromatic stand out as superior to all others. These scales enjoy harmonic and structural possibilities that go significantly beyond what is available in classical scales and therefore provide a possible medium for innovative musical composition.

Are there alternative musical scales that share key formal properties of the classical ones? A systematic search, framed as a constraint-programming problem, answers yes. Refreshingly, although this is a formal/mathematical paper by an operations-research person, published at a constraint-programming conference, it's aware of and builds on previous work of this kind done by experimental composers (discussed in Sections 2-3).
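
Just to get a feel for the kind of search involved, a brute-force toy in the same spirit. This is emphatically not the paper's constraint model; the ratio list, tolerance, and scale size are my own arbitrary choices:

```python
# Toy search: which 7-note subsets of a 19-note equal-tempered chromatic have
# many intervals close to simple just-intonation ratios?
from itertools import combinations
from math import log2

EDO = 19                                          # 19-tone equal temperament
SIMPLE_RATIOS = [3/2, 4/3, 5/4, 6/5, 5/3, 8/5]    # a handful of simple intervals
TOLERANCE = 15 / 1200                             # 15 cents, expressed in octaves

def is_simple(steps):
    """Does an interval of `steps` chromatic steps approximate a simple ratio?"""
    size = steps / EDO                            # interval size in octaves
    return any(abs(size - log2(r)) < TOLERANCE for r in SIMPLE_RATIOS)

# Precompute which of the nonzero step sizes count as "simple".
SIMPLE_STEPS = {s for s in range(1, EDO) if is_simple(s)}

def score(scale):
    """Count ordered pitch pairs whose interval (mod octave) is simple."""
    return sum((b - a) % EDO in SIMPLE_STEPS for a in scale for b in scale if a != b)

best = max(combinations(range(EDO), 7), key=score)
print(sorted(SIMPLE_STEPS), best, score(best))
```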

2018-02-26
Seagull: A bird's-eye view of the evolution of technical games research
T.-H.D. Nguyen, E. Melcer, A. Canossa, K. Isbister, M.S. El-Nasr. Entertainment Computing, 2018.
Tags: scientometrics, games

Abstract:

Within Entertainment Computing, games research has grown to be its own area, with numerous publication venues dedicated to it. As this area evolves, it is fruitful to examine its overall development—which subcommunities and research interests were present from the start, which have come and gone, and which are currently active—to better understand the research community as a whole and where it may proceed. In this paper, we present a data-driven analysis and interactive visualization tool to shed light on how technical domains within the games research field have evolved from 2000 to 2013, based on publication data from over 8000 articles collected from 48 games research venues, including Entertainment Computing, FDG, AIIDE, and DiGRA. The approach we present is descriptive. We first used data mining algorithms to group related papers into clusters of similar research topics and evolve these clusters over time. We then designed an interactive visualization system, named Seagull, comprised of Sankey diagrams that allow us to interactively visualize and examine the transition and coalescing of different clusters across time. We present our descriptive analysis in this paper and also contribute the visualization interface to allow other researchers to examine the data and develop their own analysis.

A big data-mining study on publications in games research. They quantitatively identify 20 major clusters in the field based on keyword co-occurrence: among others, "game design", "virtual worlds", "narrative", "artificial intelligence", and "human-computer interaction". Some plots tracing how these clusters grow and shrink in different venues over time.

2018-02-14
Propp's morphology of the folk tale as a grammar for generation
P. Gervás. Computational Models of Narrative Workshop, 2013.
Tags: story generation

Abstract:

The semi-formal analysis of Russian folk tales carried out by Vladimir Propp has often been used as theoretical background for the automated generation of stories. Its rigour and its exhaustive description of the constituent elements of Russian folk tales, and the enumeration of the patterns they follow, have acted as inspiration for several story generation systems, both sequential and interactive. Yet most of these efforts have attempted to generalize Propp's account to types of stories beyond the corpus that it arose from. In the process, a number of the valuable intuitions present in the original work are lost. The present paper revisits Propp's morphology to build a system that generates instances of Russian folk tales. Propp's view of the folk tale as a rigid sequence of character functions is phrased as a grammar that can be employed as a plot driver. Unification is used to incrementally build a conceptual representation of discourse by adding to an ongoing draft story actions that instantiate the character functions. Story actions are defined by pre and post conditions on the state of the plot to account for the causal relations crucial to narrative. The potential of the resulting system for providing a generic story generation system is discussed and possible lines of future work are discussed.

A paper computationally revisiting Propp's formalist analysis of Russian folk tales as a generative model. Gervás points out that, although Propp's model is well known and often referenced in story generation, when used as a basis for systems it's usually applied at a much coarser level than Propp's own analysis. Section 2 has a survey of previous story generators' use of Proppian models, finding them not all that Proppian. Section 3 then proposes a new generative implementation argued to be more faithful to Propp's analysis. [Added 2018-04-21]: The survey in Section 2 does miss Sheldon Klein's 1970s implementation of a fairly detailed Proppian story generator (admittedly somewhat obscure, and only brought to my attention by James Ryan). It would be interesting to compare Gervás's and Klein's implementations/interpretations of Propp.
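
To make the "grammar as plot driver plus pre/postconditions" idea concrete for myself, a tiny sketch. All the names, the function sequence, and the baby world model are mine, not Gervás's:

```python
# A rigid sequence of Proppian character functions drives the plot; each
# concrete story action declares pre/postconditions on the story state, and the
# generator only instantiates actions whose preconditions currently hold.
import random

FUNCTION_SEQUENCE = ["villainy", "departure", "struggle", "victory", "return"]

# action name -> (character function it realizes, preconditions, postconditions)
ACTIONS = {
    "dragon kidnaps princess": ("villainy",  set(),               {"princess_taken"}),
    "hero leaves home":        ("departure", {"princess_taken"},  {"hero_on_quest"}),
    "hero fights dragon":      ("struggle",  {"hero_on_quest"},   {"dragon_fought"}),
    "hero slays dragon":       ("victory",   {"dragon_fought"},   {"dragon_dead"}),
    "hero rides home":         ("return",    {"dragon_dead"},     {"hero_home"}),
}

def generate(seed=None):
    random.seed(seed)
    state, story = set(), []
    for func in FUNCTION_SEQUENCE:
        options = [name for name, (f, pre, _) in ACTIONS.items()
                   if f == func and pre <= state]       # preconditions satisfied?
        if not options:
            break                                       # nothing can realize this function
        choice = random.choice(options)
        story.append(choice)
        state |= ACTIONS[choice][2]                     # apply postconditions
    return story

print(". ".join(generate(0)).capitalize() + ".")
```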

2017-11-16
Wordless games: Gameplay as narrative technique
Y.T. Sim, A. Mitchell. Intl. Conf. Interactive Digital Storytelling, 2017.
Tags: interactive narrative, games

Abstract:

In this paper, we look at how gameplay can be used to tell stories in a game without the help of words. Through close readings of three wordless games with a strong narrative focus, Journey, Brothers: A Tale of Two Sons and A Bird Story, we explore how gameplay within wordless games can help to convey a narrative. We have identified four techniques by which gameplay is used for storytelling: gameplay as enacting narrative, manipulating player controls for narrative effect, gameplay for exploring narrative setting, and gameplay as time progression. We discuss these techniques in relation to existing concepts of player experience, and suggest ways gameplay can help to circumvent issues of ambiguity in wordless narrative in games.

The category of strongly narrative-driven games that have no dialog or other words is obvious in retrospect, but I hadn't thought of framing it that way. It's more specific than environmental storytelling. Both the identification of this category and pulling out four narrative techniques that it uses are useful contributions of this paper.

2017-10-27
On the notion of interestingness in automated mathematical discovery
S. Colton, A. Bundy, T. Walsh. Intl. J. Human-Computer Studies, 2000.
Tags: discovery systems, mathematics

Abstract:

We survey five mathematical discovery programs by looking in detail at the discovery processes they illustrate and the success they had. We focus on how they estimate the interestingness of concepts and conjectures and extract some common notions about interestingness in automated mathematical discovery. We detail how empirical evidence is used to give plausibility to conjectures, and the different ways in which a result can be thought of as novel. We also look at the ways in which the programs assess how surprising and complex a conjecture statement is, and the different ways in which the applicability of a concept or conjecture is used. Finally, we note how a user can set tasks for the program to achieve and how this affects the calculation of interestingness. We conclude with some hints on the use of interestingness measures for future developers of discovery programs in mathematics.

A critical overview of how five systems—AM (Lenat), GT (Epstein), Graffiti (Fajtlowicz), Bagai et al's system, and HR (Colton et al)—conceive of "interestingness" when going about trying to discover "interesting" mathematical concepts or conjectures. Also a useful concise overview of what these five systems in general are doing, with some summary tables listing the mathematical domains they work in, how they represent concepts, etc.

2017-10-13
Why most decisions are easy in Tetris—and perhaps in other sequential decision problems, as well
Ö. Simsek, S. Algorta, A. Kothiyal. Intl. Conf. Machine Learning, 2016.
Tags: reinforcement learning, games

Abstract:

We examined the sequence of decision problems that are encountered in the game of Tetris and found that most of the problems are easy in the following sense: One can choose well among the available actions without knowing an evaluation function that scores well in the game. This is a consequence of three conditions that are prevalent in the game: simple dominance, cumulative dominance, and noncompensation. These conditions can be exploited to develop faster and more effective learning algorithms. In addition, they allow certain types of domain knowledge to be incorporated with ease into a learning algorithm. Among the sequential decision problems we encounter, it is unlikely that Tetris is unique or rare in having these properties.

Demonstrates three specific reasons that Tetris is much easier than a fully general RL problem (regularities they call simple dominance, cumulative dominance, and noncompensation). Speculates that such problems are in fact quite common, and proposes that learning algorithms could be more efficient if they exploited such properties. How widely this applies to other games is a bit speculative at this stage. Besides Tetris, they have a preliminary discussion of Backgammon in Section 6.1 ("Is Tetris special?"). Some interesting research avenues suggest themselves.
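
My paraphrase of the simplest of the three conditions, as code. This is how I read "simple dominance"; the exact definition here and the toy feature vectors are mine, not the authors':

```python
# If every feature is oriented so that higher is better, an action whose
# feature vector is at least as good as another's on every feature (and better
# on at least one) can be preferred without knowing the evaluation function's
# weights at all.
import numpy as np

def simply_dominates(a, b):
    """True if feature vector a dominates b, assuming higher is always better."""
    a, b = np.asarray(a), np.asarray(b)
    return bool(np.all(a >= b) and np.any(a > b))

def undominated(actions):
    """Drop any action that is simply dominated by some other action."""
    return {name: f for name, f in actions.items()
            if not any(simply_dominates(g, f)
                       for other, g in actions.items() if other != name)}

# Toy Tetris-ish placements scored on (negated holes, negated height, negated wells).
placements = {
    "flat left":   [0, -4, -1],
    "make a hole": [-1, -4, -1],   # identical to "flat left" except it adds a hole
    "stack high":  [0, -7, 0],
}
print(list(undominated(placements)))   # "make a hole" gets filtered out
```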

2017-09-12
The ordinal nature of emotions
G.N. Yannakakis, R. Cowie, C. Busso. Intl. Conf. Affective Computing & Intelligent Interaction, 2017.
Tags: affective computing

Abstract:

Representing computationally everyday emotional states is a challenging task and, arguably, one of the most fundamental for affective computing. Standard practice in emotion annotation is to ask humans to assign an absolute value of intensity to each emotional behavior they observe. Psychological theories and evidence from multiple disciplines including neuroscience, economics and artificial intelligence, however, suggest that the task of assigning reference-based (relative) values to subjective notions is better aligned with the underlying representations than assigning absolute values. Evidence also shows that we use reference points, or else anchors, against which we evaluate values such as the emotional state of a stimulus; suggesting again that ordinal labels are a more suitable way to represent emotions. This paper draws together the theoretical reasons to favor relative over absolute labels for representing and annotating emotion, reviewing the literature across several disciplines. We go on to discuss good and bad practices of treating ordinal and other forms of annotation data, and make the case for preference learning methods as the appropriate approach for treating ordinal labels. We finally discuss the advantages of relative annotation with respect to both reliability and validity through a number of case studies in affective computing, and address common objections to the use of ordinal data. Overall, the thesis that emotions are by nature relative is supported by both theoretical arguments and evidence, and opens new horizons for the way emotions are viewed, represented and analyzed computationally.

Sometime around 2011 or 2012, back when I lived in Copenhagen and Georgios lived in Malmö, I visited him there, and on the walk to his place from Triangeln station, heard his rather exasperated-sounding explanation of why everyone (or almost everyone) in computer science is doing emotion processing completely wrong. He simply could not understand why people treated emotions as absolute numerical quantities in the face of all evidence to the contrary, with more or less the only thing in favor of that approach being that it was computationally convenient (maybe reason enough for some). This paper is sort of the culmination in print of that view, although he has published elsewhere some subsidiary aspects of his views on the question (e.g. "Don’t classify ratings of affect; Rank them!", IEEE Trans. Affective Computing, 2014). This paper covers the question from a number of angles, from theoretical views on emotion to technical questions of statistical analysis, and would take more than a note of this length to do it justice, so I'll leave it at that for now.

2017-08-26
Grimes' fairy tales: A 1960s story generator
J. Ryan. Intl. Conf. Interactive Digital Storytelling, 2017.
Tags: story generation, history

Abstract:

We provide the first extensive account of an unknown story generator that was developed by linguist Joseph E. Grimes in the early 1960s. A pioneering system, it was the first to take a grammar-based approach and the first to operationalize Propp’s famous model. This is the opening paper in a series that will aim to reformulate the prevailing history of story generation in light of new findings we have made pertaining to several forgotten early projects. Our study here has been made possible by personal communication with the system's creator, Grimes, and excavation of three obscure contemporaneous sources. While the accepted knowledge in our field is that the earliest story generator was Sheldon Klein’s automatic novel writer, first reported in 1971, we show that Grimes’s system and two others preceded it. In doing this, we reveal a new earliest known system. With this paper, and follow-ups to it that are in progress, we aim to provide a new account of the area of story generation that lends our community insight as to where it came from and where it should go next. We hope others will join us in this mission.

Outcome of a historical rabbit hole James Ryan has been excavating. I believe this was originally going to be a paper about Klein's systems, but then Ryan found this almost entirely unknown system by Grimes, dug up the surviving material on it, and managed to get in contact with Grimes for some additional information. Especially interesting that Grimes, a linguist, built the system as a means to an end for fieldwork, not as an end in itself. For most other systems, built by AI and computational creativity researchers, "AI that can write stories" is the end goal. That also explains why Grimes didn't bother to publish much on it. The system implements Propp's formalist folktale morphology in a multilingual generative system, with the stories then used for fieldwork on indigenous languages in Mexico. That brings it closer in a sense to Propp's own work as a folklorist. Alas, he found the generated stories were too boring.

2017-08-25
Some natural solutions to the p-value communication problem—and why they won't work
A. Gelman, J. Carlin. J. American Statistical Association, 2017.
Tags: statistics

Excerpt:

It is well known that even experienced scientists routinely misinterpret p-values in all sorts of ways, including confusion of statistical and practical significance, treating non-rejection as acceptance of the null hypothesis, and interpreting the p-value as some sort of replication probability or as the posterior probability that the null hypothesis is true. [...] At this point it would be natural for statisticians to think that this is a problem of education and communication. If we could just add a few more paragraphs to the relevant sections of our textbooks, and persuade applied practitioners to consult more with statisticians, then all would be well, or so goes this logic. [...] The problem does not seem just to be technical misunderstandings; rather, statistical analysis is being asked to do something that it simply can’t do, to bring out a signal from any data, no matter how noisy. We suspect that, to make progress in pedagogy, statisticians will have to give up some of the claims we have implicitly been making about the effectiveness of our methods.

If you like your takes on the replication crisis pessimistic, here's a paper title for you. They go through five specific proposed solutions and argue for why each won't work: 1) Listen to the statisticians, or clarity in exposition, 2) Confidence intervals instead of hypothesis tests, 3) Bayesian interpretation of one-sided p-values, 4) Focusing on "practical significance" instead of "statistical significance", and 5) Bayes factors. What they do propose is a more fundamental rethink of what statistical methods can and can't reliably tell you, which they admit is a harder sell.
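
The "signal from noise, no matter how noisy" point connects to Gelman and Carlin's earlier Type M/S error work. A quick simulation of the flavor of the problem (my own illustration, not from this paper):

```python
# Low-power setting: a small true effect measured with lots of noise. Among the
# runs that clear p < 0.05, the estimated effect is wildly exaggerated (Type M
# error) and sometimes has the wrong sign (Type S error); no reinterpretation
# of the p-value fixes that.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_effect, sigma, n = 0.1, 1.0, 25

sig_estimates = []
for _ in range(20000):
    a = rng.normal(0.0, sigma, n)
    b = rng.normal(true_effect, sigma, n)
    t, p = stats.ttest_ind(b, a)
    if p < 0.05:
        sig_estimates.append(b.mean() - a.mean())

sig = np.array(sig_estimates)
print(f"power ~ {len(sig) / 20000:.2f}")
print(f"mean |estimate| given significance: {np.abs(sig).mean():.2f} (truth: {true_effect})")
print(f"wrong sign given significance: {(sig < 0).mean():.2f}")
```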

2017-07-27
Twenty things to do with a computer
S. Papert, C. Solomon. MIT AI Lab Memo, 1971.
Tags: creative coding, computers in education

Excerpt:

When people talk about computers in education they do not all have the same image in mind. Some think of using the computer to program the kid; others think of using the kid to program the computer. [...] In the real world computers are used in many different ways. Some are programmed to fly airplanes; not to *tell* a human pilot what to do, but to pull levers with their own electro-mechanical effectors... [...] Why then should computers in schools be confined to computing the sum of the squares of the first twenty odd numbers and similar so-called "problem-solving" uses? Why not use them to produce some action? [...] But our purpose here is not to complain of what other people have not done, but to tell of some of the exciting things you can do with the computer you have now or with the one you will be incited to get by the pages that follow.

The opening quip about using the computer to program the kid vs. using the kid to program the computer is funny and still apt. Besides that, the paper does what it says: explain 20 educational things you can do on a computer, in this case using Logo. Mostly turtle graphics examples, plus some simple generative music, a poetry generator, etc. The thing that most struck me about it is how early 1971 sounds as a date for this. Had this paper been written in the middle of the home-computer revolution I wouldn't have been surprised, but I didn't realize that, at least in some Logo-using classrooms, kids were writing generative poetry in 1971! The paper ends with a brief exhortation to administrators that computers are more affordable than they think. It quotes a cost of "$30 per student per year, for one hour per student per week of terminal time" and extends an offer to help interested educators make their case to funders.

2017-07-20
Computer of a thousand faces: Anthropomorphizations of the computer in design (1965-1975)
T. Vardouli. Dosya, 2012.
Tags: computational design, human-computer interaction

Excerpt:

[L]ooking back to the first encounters of computers and design, one discovers a rich legacy of speculation on the implications of this merging. Inquiry into the early computational era (1965-1975) can therefore expose (part of) the cultural and historical origins of the popular but loosely defined term "computational design." [...] The intense impulse to situate the new entity of the computer in the traditional, empirical processes of design lead to assignments of anthropomorphic roles to the machine, such as the "clerk," the "partner," the "accountant," and others. These different "occupations" were eloquent metaphors denoting different approaches to the ways that the innate characteristics of the computer could be reconciled with the elusive characteristics of design, as well as to the new relationship of the machine as a design actor with the designer-author. The main body of this paper places these metaphors in conversation, thus revealing different models of computation, as well as different processes of design. The purpose of this short survey is to bring forth computational "role models" which survive until today, assert them as historical and cultural artifacts, and present their conceptual counterpoints, re-opening them for discussion.

Surveys anthropomorphic terms used in early work on computational design to describe various roles a computer could take in the design process. The five roles identified are: The Clerk, wherein "the computer would liberate the designers from the tedious, quantitative tasks involved in design, thus allowing them to channel their energy towards the truly creative parts of the design process"; The Partner, "a model which surpassed the rigid division of labor in the design process and called for a partnership between the computer and the designer"; The Wizard, which provides "imminent, seamless spatialization of the user's design intentions, prior even to their verbalization" and politically aims at de-professionalizing architecture; The Surrogate, a kind of scaled-back Wizard where "the role of the machine was to empower non-expert users, who knew very little about design but plenty about their living preferences ... By making inferences on the user's sketches and statements the computer would ideally be able to construct a model of the user and therefore operate as his surrogate, his expert alter ego, his own native architect"; and finally The Accountant, with the computer "objectively recording personal and collective histories and feeding them back to users and communities without 'agency' or 'intelligence' ... The accountant just kept the books; it was up to the inhabitants to own and manage the data in order to reflect on the implications of their past actions and plan their collective futures." These roles are primarily drawn from Nicholas Negroponte's work, except for The Accountant, which comes from Yona Friedman.

2017-05-30
Propositional & analogical generation of coordinated verbal, visual & musical texts
S. Klein et al. SIGART Bulletin, 1982.
Tags: story generation

Excerpt:

We are developing a model for human cognitive processing which assumes that a major component of the rules for calculating behavior are resident outside the individual, in the inherited, collective phenomena that anthropologists call 'culture'. Our model contains rules of behavior encoded in propositional structures such as frames or scripts, plus a method for calculating behavior by analogy. Propositional rules are transformed into situation state descriptions that are related by transformation operators which are also valid for transforming situations by analogy, and for planning by analogy. [...] We use the simulation model as a generator of operas, complete with textual, pictorial and musical output, all derived from a common semantic source. The output is in the medium of a videotape recording, and is generally entitled, Revolt in Flatland.

One of several interesting finds in the history of story generators dug up by James Ryan of UCSC. An automatic opera generator, producing its output directly to videotape, in 1982. This paper has 9 authors, which suggests a fairly large project, but it seems to be the only publication about the project (though it builds on some technologies that have their own publications), and it's been cited exactly once, so the project mostly passed without any attention. Strange, but probably partly due to Sheldon Klein's rather unorthodox way of working and publishing, which seems to have led him to build a number of very interesting systems that got relatively little attention. I wonder if any of the videotapes still exist?

2016-12-02
Classification accuracy is not enough: On the evaluation of music genre recognition systems
B.L. Sturm. J. Intelligent Information Systems, 2013.
Tags: computer music, machine learning

Abstract:

We argue that an evaluation of system behavior at the level of the music is required to usefully address the fundamental problems of music genre recognition (MGR), and indeed other tasks of music information retrieval, such as autotagging. A recent review of works in MGR since 1995 shows that most (82%) measure the capacity of a system to recognize genre by its classification accuracy. After reviewing evaluation in MGR, we show that neither classification accuracy, nor recall and precision, nor confusion tables, necessarily reflect the capacity of a system to recognize genre in musical signals. Hence, such figures of merit cannot be used to reliably rank, promote or discount the genre recognition performance of MGR systems if genre recognition (rather than identification by irrelevant confounding factors) is the objective. This motivates the development of a richer experimental toolbox for evaluating any system designed to intelligently extract information from music signals.

Intriguing paper digging into how you determine whether a music-genre recognition system is actually recognizing genre. Argues that the common headline metric, classification accuracy on some labeled dataset, is often misleading (significantly overstating success). Probably applicable to other ML domains as well. One example of why: many systems don't really recognize genre but instead pick up on technical mastering features that happen to correlate strongly with genre in their particular dataset (not musical features at all, but features of the recording/production process). For example, specific reverb effects tend to be associated with live performance in classical halls, and country-western-specializing studios use certain kinds of mastering/production techniques that are rarely used by rock-specializing studios and vice versa. For some applied classification purposes that's fine, but for others it's clearly not satisfying, and it may cause generalization failures.
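
A stripped-down version of the failure mode, just to convince myself of the mechanics. The toy data and the "mastering" feature are invented, not from Sturm's experiments:

```python
# A "genre" classifier that only sees a production/mastering proxy feature can
# still post high accuracy on a dataset where that proxy correlates with genre,
# while learning nothing about the music itself.
import numpy as np

rng = np.random.default_rng(0)
n = 2000

genre = rng.integers(0, 2, n)                  # 0 = classical, 1 = country
# A non-musical recording feature (say, a reverb/compression statistic) that
# happens to track genre in this dataset because of studio practices.
mastering = genre + 0.3 * rng.normal(size=n)
# The actual musical content, which the classifier below never looks at.
musical = rng.normal(size=(n, 5))

X = np.column_stack([mastering])               # model input: the proxy only
train, test = slice(0, 1000), slice(1000, None)

# A one-feature threshold "classifier".
threshold = X[train].mean()
pred = (X[test, 0] > threshold).astype(int)
accuracy = (pred == genre[test]).mean()
print(f"accuracy from the mastering feature alone: {accuracy:.2f}")
```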