Ref: http://deedy.quora.com/Hacking-into-the-Indian-Education-System?srid=3THA&share=1
I read the article above and could not resist trying to simulate the behavior. The spike problem pointed out in the article is similar to one that happens with hash functions. If graders grade in certain increments, e.g. 0, 2, 4, 8, 10 or 0, 5, 10, it is possible to replicate the plots.
The plots below are generated assuming there are internal grades for some project and a number of questions with some marks per question. I also assume that the grader grades in some increments, e.g. 0, 2, 4, 8, 10 or 0, 5, 10. I generated random samples for 100,000 students for both internal and external marks. The source code is at the end. The first set of plots is without any bias, i.e. the grader was not biased to give good or bad grades. The second set of plots assumes three different types of graders: each one is either unbiased (bias 0), biased to give good marks (positive bias), or biased to give bad marks (negative bias).
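To see why grading in increments alone can create spikes, here is a minimal sketch, separate from the full source at the end; the increment set, question count, and sample size are arbitrary choices for illustration, not the exact values behind the plots in this post:

import collections
import numpy

# Hypothetical example: 8 questions, each graded only in these increments.
per_question_marks = [0, 2, 4, 8, 10]
nquestions = 8
nstudents = 100000

numpy.random.seed(0)
totals = collections.Counter()
for _ in range(nstudents):
    picks = numpy.random.randint(0, len(per_question_marks), nquestions)
    totals[sum(per_question_marks[p] for p in picks)] += 1

# Odd totals are unreachable, and the reachable totals occur with very
# different frequencies, which is exactly the spiky histogram shape.
for t in sorted(totals):
    print("%3d: %6d" % (t, totals[t]))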
Without Bias: We can see evidence of spikes in almost all cases below, except the last one.
With Bias: With bias, we can still see spikes, and we also see skewed and bimodal distributions.
Conclusion: It is possible that ICSE graders are grading questions in some common increments and are not necessarily rigging the grades.
Python Source Code:
With Bias:
from __future__ import division
import math
import numpy
import pylab


def marks2(n=100, internal_marks=[0, 10, 20],
           per_question_marks=[0, 2, 4, 8, 10], total=100, figname=None,
           bias1=0, bias2=0, bias3=0):
    m1 = marks(n, internal_marks, per_question_marks, total, figname, bias1)
    m2 = marks(n, internal_marks, per_question_marks, total, figname, bias2)
    m3 = marks(n, internal_marks, per_question_marks, total, figname, bias3)
    m = numpy.concatenate([m1, m2, m3])
    x = numpy.arange(total + 2)
    y, _ = numpy.histogram(m, x)
    print "Mean=%s, stderr=%s min=%s max=%s median=%s" % \
        (m.mean(), m.std() / math.sqrt(len(m)), m.min(), m.max(),
         numpy.median(m))
    pylab.plot(x[:-1], y, 'r-o')
    pylab.grid()
    pylab.title("Internal=%s, External=%s,\nn=%s, bias1=%d, bias2=%d bias3=%d"
                % (str(internal_marks), str(per_question_marks), n,
                   bias1, bias2, bias3))
    pylab.xlabel("marks")
    pylab.draw()


def marks(n=100, internal_marks=[0, 10, 20],
          per_question_marks=[0, 2, 4, 8, 10], total=100, figname=None,
          bias=0):
    m = numpy.zeros(n)
    for i in range(n):
        m[i] = marks_one_student(internal_marks, per_question_marks, total,
                                 bias)
    return m


def pos(n, bias):
    return numpy.max([0, numpy.min([numpy.random.randint(n) + bias, n - 1])])


def marks_one_student(internal_marks, per_question_marks, total, bias):
    nquestions = (total - numpy.max(internal_marks)) // \
        numpy.max(per_question_marks)
    assert numpy.max(internal_marks) + \
        nquestions * numpy.max(per_question_marks) == total
    marks = internal_marks[pos(len(internal_marks), bias)]
    for i in range(nquestions):
        marks += per_question_marks[pos(len(per_question_marks), bias)]
    return marks
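Assuming the definitions above have been executed, a call along these lines produces a biased-grader plot; the particular n and bias values here are illustrative, not necessarily the ones used for the plots in this post:

# Illustrative only: three grader types with bias 0, +2 and -2.
marks2(n=100000, internal_marks=[0, 10, 20],
       per_question_marks=[0, 2, 4, 8, 10], total=100,
       bias1=0, bias2=2, bias3=-2)
pylab.show()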
Without Bias:
from __future__ import division
import math
import numpy
import pylab


def marks(n=100, internal_marks=[0, 10, 20],
          per_question_marks=[0, 2, 4, 8, 10], total=100, figname=None):
    m = numpy.zeros(n)
    for i in range(n):
        m[i] = marks_one_student(internal_marks, per_question_marks, total)
    print "Mean=%s, stderr=%s min=%s max=%s median=%s" % \
        (m.mean(), m.std() / math.sqrt(len(m)), m.min(), m.max(),
         numpy.median(m))
    x = numpy.arange(total + 2)
    y, _ = numpy.histogram(m, x)
    print len(x), x
    print len(y), y
    pylab.plot(x[:-1], y, 'r-o')
    pylab.grid()
    pylab.title("Internal=%s, External=%s,\nn=%s, Mean=%.1f, Median=%.1f"
                % (str(internal_marks), str(per_question_marks), n,
                   m.mean(), numpy.median(m)))
    pylab.xlabel("marks")
    pylab.draw()
    if figname:
        pylab.savefig(figname)
    return m


def marks_one_student(internal_marks, per_question_marks, total):
    nquestions = (total - numpy.max(internal_marks)) // \
        numpy.max(per_question_marks)
    assert numpy.max(internal_marks) + \
        nquestions * numpy.max(per_question_marks) == total
    marks = internal_marks[numpy.random.randint(0, len(internal_marks))]
    for i in range(nquestions):
        marks += per_question_marks[
            numpy.random.randint(0, len(per_question_marks))]
    return marks
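Likewise, assuming the unbiased definitions above have been executed, a call such as the following reproduces the style of the unbiased plots; the sample size matches the 100,000 students mentioned earlier, and the output file name is a placeholder:

# Illustrative only: "unbiased.png" is a placeholder file name.
marks(n=100000, internal_marks=[0, 10, 20],
      per_question_marks=[0, 2, 4, 8, 10], total=100,
      figname="unbiased.png")
pylab.show()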