Conversation
1. Added Gc class to sync.py 2. Mounted to ExpirableDictStorage and PersistentDictStorage 3. dict interface changed
| storage_class = fsync.PersistentDictStorage | ||
| else: | ||
| storage_class = fsync.ExpirableDictStorage | ||
| storage_class.maxsize = maxsize |
There was a problem hiding this comment.
This code looks like it changes the maxsize value of the storage class. It will cause side effect for other calls with different value.
There was a problem hiding this comment.
I assumed
if storage_class is None:
means that we're in a 'default' mode so we're in the somewhat private zone... what about _maxsize?
There was a problem hiding this comment.
The point is not about the naming rule. The next code will make problem.
# maxsize of f is 512 here because fsync.PersistentDictStorage.maxsize is 512
@ring.dict(maxsize=512)
def f():
...
# maxsize of both f and g are 128 now becasue fsync.PersistentDictStorage is 128
@ring.dict(maxsize=128)
def g():
...We shouldn't set any kind of class variable for decorator parameters.
There was a problem hiding this comment.
I totally missed that. What a shame. I'll figure out a better way.
ring/func/sync.py
Outdated
| def __init__(self, backend, target_size, expire_f=None): | ||
| assert round(target_size) > 0, 'target_size has to be at least 1' | ||
| assert expire_f is None or callable(expire_f), 'expire_f has to be function or None' | ||
| assert isinstance(backend, type({})), 'backend has to be dict-like' |
There was a problem hiding this comment.
abc.MutableMapping will be better than type({}). https://docs.python.org/3/library/collections.abc.html#collections.abc.MutableMapping
ring/func/sync.py
Outdated
|
|
||
| self._mutex.acquire() | ||
| if (len(self._backend) > self._target_size): | ||
| WorkThread(self).start() |
There was a problem hiding this comment.
Do we take any advantage by using threads? I think gc for dict is cpu-bouded job and there is no benefits by threading. Tell me if I missed something.
There was a problem hiding this comment.
Yes, you're right. The cache itself is in-memory so won't be any heavy I/O issue. However, If the dict entries are big enough, like 1000000, set_value(or in some other methods) can be blocked quite an amount of time. At least we can context switch if we go multithreading.
This is just my opinion and have no issues removing the threading.
There was a problem hiding this comment.
Let's remove it. If the size matters, I think there are 2 reasons to regard threading is overkill.
- If the backend requires IO job, it makes sense. But for dict, even 1000000 iteration is a microsecond problem. In most of cases with reasonable functions, it will be less than 1000000.
- If we correctly add locking to it, gc blocks any kind of adding and removing from the dict - which mean the caller still should wait for gc job.
ring/func/sync.py
Outdated
|
|
||
| class ExpirableDictStorage(fbase.CommonMixinStorage, fbase.StorageMixin): | ||
|
|
||
| maxsize = 128 |
There was a problem hiding this comment.
This value should not be bound to the class
There was a problem hiding this comment.
Again, I can't 100% follow your thoughts with maxsize.
My guess is you're worrying that this value can be easily accessed & modified and my understanding is ExpirableDictStorage & PersistentDictStorage is our implement for default calling so we can control this value.
ring/func/sync.py
Outdated
| self.backend[key] = expired_time, value | ||
|
|
||
| if self.maxsize < len(self.backend): | ||
| if self._gc == None: |
There was a problem hiding this comment.
We may check this part after solving threading design of gc
|
Thanks for the contribution!
The CI failure is about formatting. You can check the result from https://travis-ci.org/youknowone/ring/builds/546264007?utm_source=github_status&utm_medium=notification |
|
Thanks again for your kind & careful advice.
CI Failure was a fail of my own test. I'll correct it at next PR. |
|
Trimmer can be the name too. I don't know the proper name for this kind of job. So if you have any suggestion or find a correct name for it, please use your suggestion to the code. GC was just an expression in issues to make me sure not to be confused later. |
|
@youknowone |
|
I'm working on |
| super(RingRope, self).__init__(*args, **kwargs) | ||
| self.user_interface = self.user_interface_class(self) | ||
| self.storage = self.storage_class(self, storage_backend) | ||
| self.storage = self.storage_class(self, storage_backend, maxsize) |
There was a problem hiding this comment.
Does redis_hash failure related to this change?
There was a problem hiding this comment.
Yes, __init__ is overridden in RedisHashStorage. I'm looking at right now and seems there are some workarounds to get back to work but not sure which one will be the best one.
The problem is that BaseStorage has a init and ExpirableDictStorage & PersistentDictStorage adheres it but RedisHashStorage overrides it.
Can you tell me your opinion?
|
@youknowone |
|
Sorry, i was busy for last week. i will check your recent change soon |
Gcclass to sync.pyExpirableDictStorageandPersistentDictStoragedictinterface changedSolves #115 issue.
DEBUGvariable in Gc class is for debug log purpose. If you know the better way, please, let me know.PriorityQueueto remove expired ones first. Build time would be betweenO(n)orO(n*log(n))so it won't be that harmful.