Commit 7a4cffb

[feature] add md versions of others notebooks, update readme
1 parent b834a32 commit 7a4cffb

8 files changed

+662
-7
lines changed

README.md

Lines changed: 7 additions & 7 deletions
@@ -161,10 +161,10 @@ A catalogue of data structures implementation + algorithms and coding problems a
161161

162162
## Others
163163

164-
- [Hit Counter](others/hit_counter.ipynb)
165-
- [Job Scheduler](others/job_scheduler.ipynb)
166-
- [Kaprekar's constant](others/kaprekar's_constant.ipynb)
167-
- [Regular expression matching](others/regex.ipynb)
168-
- [Stable marriage problem](others/stable_marriage_problem.ipynb)
169-
- [Url on the browser](others/url_browser_explanation.ipynb)
170-
- [Word sense disambiguation](others/word_sense_disambiguation.ipynb)
164+
- [Hit Counter](others/hit_counter.md)
165+
- [Job Scheduler](others/job_scheduler.md)
166+
- [Kaprekar's constant](others/kaprekar's_constant.md)
167+
- [Regular expression matching](others/regex.md)
168+
- [Stable marriage problem](others/stable_marriage_problem.md)
169+
- [Url on the browser](others/url_browser_explanation.md)
170+
- [Word sense disambiguation](others/word_sense_disambiguation.md)

others/hit_counter.md

Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
1+
## Hit Counter
2+
3+
Design and implement a HitCounter class that keeps track of requests (or hits). It should support the following operations:
4+
5+
- record(timestamp): records a hit that happened at timestamp
6+
- total(): returns the total number of hits recorded
7+
- range(lower, upper): returns the number of hits that occurred between timestamps lower and upper (inclusive)
8+
9+
What if our system has limited memory?
10+
11+
12+
```python
class HitCounter:
    def __init__(self):
        self.hits = []

    def record(self, timestamp):
        self.hits.append(timestamp)

    def total(self):
        return len(self.hits)

    def range(self, lower, upper):
        # linear scan over every recorded timestamp
        count = 0
        for hit in self.hits:
            if lower <= hit <= upper:
                count += 1
        return count
```
30+
31+
Here record and total are constant-time operations, while range takes O(N) time.
32+
33+
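One tradeoff we can make is to use a sorted list or BST to keep track of the hits. This allows the range operation to take O(log N) time, at the cost of a slower record. We can use Python's [bisect](https://docs.python.org/3/library/bisect.html) to handle sortedness; note that insort finds the position in O(log N), but inserting into a list is still O(N).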
34+
35+
36+
37+
```python
import bisect


class HitCounter:
    def __init__(self):
        self.hits = []  # kept sorted at all times

    def record(self, timestamp):
        bisect.insort_left(self.hits, timestamp)

    def total(self):
        return len(self.hits)

    def range(self, lower, upper):
        # index of the first hit >= lower and of the first hit > upper
        low = bisect.bisect_left(self.hits, lower)
        high = bisect.bisect_right(self.hits, upper)
        return high - low
```
56+
57+
While this is time efficient, it'll still take a lot of space because we are still saving each timestamp into the list.
58+
59+
We can sacrifice accuracy for memory by grouping timestamps into minutes or hours. We'll lose accuracy around the borders of a range, but use up to a constant factor less space.
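To make the accuracy loss concrete, here is a minimal sketch, assuming timestamps are in seconds and using made-up sample data. The names `timestamps`, `buckets` and `approx_range` are hypothetical and only serve this illustration.

```python
from collections import Counter

# hypothetical hit times, in seconds
timestamps = [5, 59, 61, 118, 125]

# one counter per minute instead of one entry per hit
buckets = Counter(t // 60 for t in timestamps)


def approx_range(lower, upper):
    """Count the hits of every minute bucket touched by [lower, upper]."""
    return sum(buckets[m] for m in range(lower // 60, upper // 60 + 1))


exact = sum(1 for t in timestamps if 50 <= t <= 70)
approx = approx_range(50, 70)
print(exact, approx)
```

The bucketed count includes every hit in the first and last minute of the range, so it can overcount near the borders, while the number of stored entries is bounded by the number of distinct minutes.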
60+
61+
For our solution, we'll keep track of each group in a tuple, where the first item is a timestamp (in minutes) and the second item is the number of hits occurring within that minute. We'll keep the list of tuples sorted by minute so that record can find a minute's tuple by binary search in O(log N) time.
62+
63+
```
64+
tuple = (minute, hits_within_this_minute)
65+
```
66+
67+
68+
```python
import bisect
from math import floor


class HitCounter:
    def __init__(self):
        self.hits = []  # sorted list of (minute, hits_within_this_minute)
        self.counter = 0

    def record(self, timestamp):
        self.counter += 1

        minute = floor(timestamp / 60)

        # (minute,) sorts before any (minute, count), so this is the bucket's slot
        idx = bisect.bisect_left(self.hits, (minute,))

        if idx < len(self.hits) and self.hits[idx][0] == minute:
            self.hits[idx] = (minute, self.hits[idx][1] + 1)
        else:
            self.hits.insert(idx, (minute, 1))

    def total(self):
        return self.counter

    def range(self, lower, higher):
        lo = floor(lower / 60)
        hi = floor(higher / 60)
        lo_idx = bisect.bisect_left(self.hits, (lo,))
        hi_idx = bisect.bisect_left(self.hits, (hi + 1,))

        # sum the counts of each tuple whose minute lies in [lo, hi]
        return sum(self.hits[i][1] for i in range(lo_idx, hi_idx))
```
101+
102+
103+

others/job_scheduler.md

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
1+
## Job Scheduler
2+
3+
Implement a job scheduler that takes in a function f and an integer N, and calls the function after N milliseconds.
4+
5+
6+
### 1st Approach
7+
There are many ways to do this. A straightforward solution is to spin off a new thread for each function we want to delay, sleep for N milliseconds in that thread, then run the function.
8+
9+
10+
11+
```python
import threading
from time import sleep

class Scheduler:
    def __init__(self):
        pass

    def delay(self, func, n):
        def sleep_then_call():
            sleep(n / 1000)  # n is in milliseconds; sleep() expects seconds
            func()

        # one new thread per delayed call
        t = threading.Thread(target=sleep_then_call)
        t.start()
```
27+
28+
### 2nd Approach
29+
While this works, there's a huge problem with our logic: we spin off a new thread each time we call delay! The number of threads will easily grow as we have more functions to schedule.
30+
31+
We can solve this by having one dedicated thread to call functions, and storing functions we need to call in some data structure, say a list.
32+
33+
The dedicated thread then polls the list to check when a function is due. We can store each function along with a Unix epoch timestamp that tells when it should run.
34+
35+
After checking the list for any jobs that are due to run, we run them and remove them from the list.
36+
37+
38+
```python
import threading
from time import sleep, time

class Scheduler:
    def __init__(self):
        self.functions = []  # saves tuples of (function, time-to-run-it in ms)
        self.lock = threading.Lock()  # guards self.functions across threads
        t = threading.Thread(target=self.poll)
        t.start()

    def poll(self):
        while True:
            now = time() * 1000  # change from sec to ms
            with self.lock:
                due_now = [f for f, due in self.functions if due <= now]
                self.functions = [(f, due) for f, due in self.functions if due > now]
            # run the due jobs outside the lock so delay() is never blocked by a job
            for function in due_now:
                function()
            sleep(0.01)

    def delay(self, function, n):
        with self.lock:
            self.functions.append((function, time() * 1000 + n))
```
60+
61+
You can go further by doing:
62+
- Extend the scheduler to allow functions with arguments
63+
- Use a heap instead of a list to keep track of the next job to run more efficiently
64+
- Come up with a way to be woken up when a function is due, say with a condition variable, instead of polling
65+
- Use a thread pool to run jobs on more than one thread without the chance of starvation (when one job is not able to run because another one is still running)
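A few of these extensions can be sketched together. The block below is one possible design, not a definitive implementation: the class name `HeapScheduler`, its private attributes, and the choice of `time.monotonic` are assumptions made for illustration. It keeps jobs in a min-heap ordered by due time, accepts positional arguments, and uses a condition variable so the worker sleeps until the next job is due instead of polling.

```python
import heapq
import itertools
import threading
from time import monotonic


class HeapScheduler:
    """Hypothetical variant: min-heap of jobs plus a condition variable."""

    def __init__(self):
        self._jobs = []  # heap of (due_time_in_seconds, sequence, function, args)
        self._seq = itertools.count()  # tie-breaker so functions are never compared
        self._cond = threading.Condition()
        threading.Thread(target=self._run, daemon=True).start()

    def delay(self, function, n, *args):
        """Schedule function(*args) to run after n milliseconds."""
        with self._cond:
            job = (monotonic() + n / 1000, next(self._seq), function, args)
            heapq.heappush(self._jobs, job)
            self._cond.notify()  # the earliest due time may have changed

    def _run(self):
        while True:
            with self._cond:
                while not self._jobs:
                    self._cond.wait()
                due, _, function, args = self._jobs[0]
                remaining = due - monotonic()
                if remaining > 0:
                    # sleep until the job is due or a new job arrives
                    self._cond.wait(timeout=remaining)
                    continue
                heapq.heappop(self._jobs)
            function(*args)  # run outside the lock
```

For example, `HeapScheduler().delay(print, 500, "hello")` would print `hello` roughly half a second later. The worker is a daemon thread here, so pending jobs are dropped when the main program exits; a real scheduler would likely want an explicit shutdown instead.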
66+
67+
68+

others/kaprekar's_constant.md

Lines changed: 81 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,81 @@
1+
## Kaprekar's constant
2+
3+
The number 6174 is known as Kaprekar's constant, after the mathematician who discovered an associated property: for all four-digit numbers with at least two distinct digits, repeatedly applying a simple procedure eventually results in this value.
4+
5+
The procedure is as follows:
6+
7+
- For a given input x, create two new numbers that consist of the digits in x in ascending and descending order.
8+
- Subtract the smaller number from the larger number.
9+
10+
For example, this algorithm terminates in three steps when starting from 1234:
11+
12+
```js
13+
4321 - 1234 = 3087
14+
8730 - 0378 = 8352
15+
8532 - 2358 = 6174
16+
```
17+
Write a function that returns how many steps this will take for a given input N.
18+
19+
## Solution
20+
To solve this imperatively, we can implement a while loop that continually runs the procedure described above until obtaining the number 6174.
21+
22+
For each iteration of the loop we will increment a counter for the number of steps, and return this value at the end.
23+
24+
We also use a helper function that prepends zeros if necessary so that the number always remains four digits long, before creating the ascending and descending integers.
25+
26+
27+
```python
def get_digits(n):
    # left-pad with zeros so the number always has four digits
    return str(n).zfill(4)

def count_steps(n):
    # assumes n has at least two distinct digits; otherwise the loop never ends
    count = 0
    while n != 6174:
        digits = get_digits(n)
        n = int(''.join(sorted(digits, reverse=True))) - int(''.join(sorted(digits)))
        count += 1
    return count
```
42+
43+
44+
```python
45+
count_steps(12)
46+
```
47+
48+
49+
50+
51+
3
52+
53+
54+
55+
56+
```python
# Recursive solution (reuses get_digits from above)
def count_steps(n, steps=0):
    if n == 6174:
        return steps
    digits = get_digits(n)
    num = int(''.join(sorted(digits, reverse=True))) - int(''.join(sorted(digits)))

    return count_steps(num, steps + 1)
```
65+
66+
67+
```python
68+
count_steps(1234)
69+
```
70+
71+
72+
73+
74+
3
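As a sanity check on the claim that every valid input reaches 6174, the sketch below brute-forces all four-digit inputs. It assumes leading zeros are allowed (so 12 is treated as 0012, as in the example above), and `kaprekar_steps`, `valid` and `longest` are hypothetical names used only here.

```python
def kaprekar_steps(n):
    """Number of iterations of the procedure needed to reach 6174."""
    steps = 0
    while n != 6174:
        digits = sorted(str(n).zfill(4))  # pad to four digits, ascending order
        n = int("".join(reversed(digits))) - int("".join(digits))
        steps += 1
    return steps


# every four-digit number (leading zeros allowed) with at least two distinct digits
valid = [n for n in range(1, 10000) if len(set(str(n).zfill(4))) > 1]
longest = max(kaprekar_steps(n) for n in valid)
print(longest)
```

The loop terminates for every element of `valid`, and `longest` gives the worst-case number of steps; the procedure is known to need at most seven iterations.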
75+
76+
77+
78+
79+

others/regex.md

Lines changed: 74 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,74 @@
1+
## Implement Regular Expression Matching
2+
3+
Implement regular expression matching with the following special characters:
4+
5+
- `.` (period), which matches any single character
6+
- `*` (asterisk), which matches zero or more of the preceding element
7+
That is, implement a function that takes in a string and a valid regular expression and returns whether or not the string matches the regular expression.
8+
9+
For example, given the regular expression "ra." and the string "ray", your function should return true. The same regular expression on the string "raymond" should return false.
10+
11+
Given the regular expression ".*at" and the string "chat", your function should return true. The same regular expression on the string "chats" should return false.
12+
13+
14+
```python
### Approach

# Base case: if r == '', return s == ''.
# Otherwise, if the first character in r is not followed by an asterisk (*),
# match the first character of both r and s. If they match, return
# matches(s[1:], r[1:]). If they don't, return False.
# If the first character in r is followed by an asterisk, try matching zero
# occurrences of it, then one, two, ... for as long as s keeps matching.

# helper function that checks whether the first characters match
def matches_first_char(s, r):
    return len(s) > 0 and (s[0] == r[0] or r[0] == ".")

def matches(s, r):
    # base case
    if r == "":
        return s == ""

    # The first char in the regex r is not followed by a *
    if len(r) == 1 or r[1] != "*":
        if matches_first_char(s, r):
            return matches(s[1:], r[1:])
        else:
            return False

    # The first char in r is followed by a *
    if matches(s, r[2:]):
        # Try zero length
        return True

    # If it doesn't match straight away, try globbing until
    # the first character of the string doesn't match anymore.
    i = 0
    while matches_first_char(s[i:], r):
        if matches(s[i+1:], r[2:]):
            return True
        i += 1
    return False
```
52+
53+
54+
```python
55+
r = "tx."
56+
s = "txt"
57+
matches(s, r)
58+
```
59+
60+
61+
62+
63+
True
64+
65+
66+
67+
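If results are memoized per pair of suffixes of s and r, this takes **O(len(s) * len(r))** time and space, since there are only that many distinct pairs. Without memoization, the recursion can revisit the same suffixes many times and take exponential time in the worst case.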
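A memoized variant can be sketched as follows. This is one possible formulation, not the notebook's own code: the name `matches_memo` and the inner helper `go` are hypothetical, and it works on indices rather than slices so that each (i, j) pair is computed once.

```python
from functools import lru_cache


def matches_memo(s, r):
    """Return True if the whole string s matches the pattern r."""

    @lru_cache(maxsize=None)
    def go(i, j):
        # does s[i:] match r[j:]?
        if j == len(r):
            return i == len(s)
        first = i < len(s) and r[j] in (s[i], ".")
        if j + 1 < len(r) and r[j + 1] == "*":
            # zero occurrences, or consume one char and stay on the same element
            return go(i, j + 2) or (first and go(i + 1, j))
        return first and go(i + 1, j + 1)

    return go(0, 0)


print(matches_memo("chat", ".*at"))
```

The cache holds at most (len(s) + 1) * (len(r) + 1) entries, which is where the bound above comes from.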
68+
69+
Fun fact: Stephen Kleene introduced the * operator in regular expressions and as such, it is sometimes referred to as the Kleene star.
70+
71+
72+
