The chosen frontend is OwnCloud.
Its frontend functionality should match that of Copy or Dropbox.
The interface covers standard user needs, including at least:
- A clean, intuitive interface (AJAX, drag & drop)
- Upload/download files from your computer via the web UI
- Public link manager: create or delete public links to files or folders
- File change history
- Restore deleted files
- Encrypted storage
- Share files with a link; the decryption key can be included in the URL (like Mega)
- Share folders with a link; the decryption key can be included in the URL (like Mega)
- Client-side JavaScript encryption: data is encrypted before it leaves your computer, to prevent man-in-the-middle and other attacks
- Server-side encryption: stored data is encrypted so that nobody but you can access it
- Configure upload transfer rate
- File preview in the file manager (without navigating to another page)
First Start
The first time you log in to the frontend it should:
- Create a user and administrator account
- Create a secure password (checked by a strength meter)
OwnCloud Tracking Model
OwnCloud has two tracking methods that we should block:
- It tries to connect to an external domain (probably www.owncloud.org)
- It sends code to the browser, which then connects to a tracking page
Proposed Encryption Model – General Overview
Encryption is applied in three layers. It is well known that more layers do not by themselves add more security, but each layer protects a different step of the process.
On file transfers, the file is encrypted before it leaves the browser (on upload) or decrypted when it arrives in the browser (on download). The original filename is kept, but the content is encrypted.
For file storage, OwnCloud encrypts the data based on the user's key, and it is stored on a dm-crypt volume.
File Upload Model
When you upload a file to storage, it should:
1. Remove critical file metadata before the file leaves the browser
2. Encrypt it with JavaScript before it leaves the browser, if this option is checked on the upload form or in the user configuration («secure mode»)
Upload/Download Client-Side (Browser JavaScript) Encryption
On upload, download, or collaborative editing, you are asked for a password to access your file if it is configured in «secure mode». The password never leaves your browser.
In this way:
- OwnCloud does not know what is inside your files
- Each file can have a different decryption password (useful for sharing files with other people)
- It can also run without «secure mode», trading some security for usability for regular users
OwnCloud Bug Fixing
Like all software, OwnCloud has errors reported on its bug tracker that need to be solved. The full list is included in the development OwnCloud Appendix 3.1.
OwnCloud Implementation Step 1
In the first implementation phase, OwnCloud loses collaborative online document editing for files in «secure mode» (files encrypted with client-side JavaScript).
OwnCloud Implementation Step 2
In the second implementation phase, OwnCloud regains collaborative online document editing for files in «secure mode» by developing a new P2P connection for the collaborative platform, described in Service 7.
Share Link Proposal
When a user creates a link to share a document that is in «secure mode», it should be possible to embed the decryption password in the link so that the recipient is not asked for one, as Mega does.
When User 2 opens the shared link, their browser goes to User 1's frontend and downloads the file.
Once the download is complete, the browser JavaScript asks for the decryption password if it is needed and not included in the link.
If the decryption password is included in the link (for example in the URL fragment after «#»), it never leaves the browser and is never logged in the HTTP server's request URLs.
File Indexation
When a file is not in «secure mode», OwnCloud indexes both its filename (as it does now) and its content.
When searching for a file, the user can choose to search by filename only, by content, or both.
Feature (Dropbox / GDrive / Mega) | OwnCloud |
---|---|
Upload files from your computer via WebUI | X |
Sync a folder of your computer via Agent | X |
Link manager (remove download links to make files private again) | X |
Filesystem management from WebUI | X |
Separated view of uploaded images | X |
File change history | X |
RSS feed of file changes | X |
Collaborative online editing | X |
Restore and download deleted files | X |
Encrypted storage | X |
Mobile applications: download and stream your files | X |
API for 3rd parties plugins | X |
Web UI in many languages | X |
List of recently added/modified files | X |
Firefox extension | X |
Create collaborative text document | X |
Share files with a link (including decryption key on URL or not) | X (without decryption key) |
Share folders with a link (including decryption key on URL or not) | X (without decryption key) |
Share files by mail (including decryption key on URL or not) | |
Share folders by mail (including decryption key on URL or not) | |
Encryption on client side and server side | Server Side |
Encryption key management | |
2048-bit RSA (or better) | ? |
RSA key based on your password + entropy | ? |
Configure upload transfer rate | |
Lost password (master crypto key) = data loss | X (but it has a master key) |
Create collaborative presentation | |
Create collaborative spreadsheet | |
Create collaborative form/poll | |
Create collaborative draw | |
Show file preferences (which apps can open, owners, editors, etc) | |
Stores cache in the web browser so pages load faster | |
File preview in the file manager (without navigating to another page) | |
Owner list | |
Connection history | |
Mark documents as favorites | |
File manager view files in grid | |
File manager view files in list | |
Hard disk space statistics | |
Used bandwidth statistics | |
File change history search by date | |
Drag & drop files to upload | X |
Option to not keep history | |
Transfer folder to any other user | |
List connected devices | |
Resume file upload by WebUI | |
Resume file upload by Agent | |
Create public upload folders | |
Share folders, files and links with other users | |
Create user groups | |
Applications (bookmarks, IM, music, etc) | X |
Nice intuitive interface | X |
Metatag system to improve search | |
Resume file transfers via WebUI | |
Total development | Hours: 377h | Cost: 15080€ |
---|---|---|
Second Month | Hours: 112h | Cost: 4480€ |
---|---|---|
* Can't Change Full Name for Users | 3,75h | 150€ |
* OC7.0.0 Cannot upload 2 directories through Drag and Drop | 6,25h | 250€ |
* A folder shared with two groups appears twice for an user in these two groups. | 7,5h | 300€ |
* 3rd level files/folders looking like they do not inherit privileges/permissions | 6,25h | 250€ |
* "Pictures" view mode bug with folder ending by a space | 6,25h | 250€ |
* [7.0.1] Confusion with share to user name | 6,25h | 250€ |
* [7.0.1] Drag and drop folder gives no error | 7,5h | 300€ |
* UI improvements for external storage configuration | 5h | 200€ |
* Folder specific views | 6,25h | 250€ |
* Show that the encryption recovery key password is set (usability) | 5h | 200€ |
* restoring deleted shares | 5h | 200€ |
* Seamless integration with Libreoffice | 60h | 2400€ |
Third Month | Hours: 92,5h | Cost: 3700€ |
---|---|---|
* Share files with a link (including decryption key on URL or not); currently the decryption key cannot be included in the URL | 6,25h | 250€ |
* Share folders with a link (including decryption key on URL or not); currently the decryption key cannot be included in the URL | 6,25h | 250€ |
* Share files by mail (including decryption key on URL or not) | 7,5h | 300€ |
* Share folders by mail (including decryption key on URL or not) | 6,25h | 250€ |
* Configure upload transfer rate | 7,5h | 300€ |
* Stores cache on web browser to load webpage faster | 6,25h | 250€ |
* Hard disk space statistics | 3,75h | 150€ |
* JS to encrypt files in the web browser before upload | 6,25h | 250€ |
* List connected devices | 5h | 200€ |
* Resume file upload on WebUI | 10h | 400€ |
* Create public upload folders with chosen criteria (e.g. no uploads larger than 1 GB) | 7,5h | 300€ |
* Share folders, files and links with other OwnCloud users | 8,75h | 350€ |
* Resume file downloads on WebUI | 8,75h | 350€ |
* Manage password & encryption keys | 5h | 200€ |
* Private password for restore | 3,75h | 150€ |
This document addresses the analysis of i2p-Tahoe-LAFS version 1.10 in order to implement three new features:
Quota management
Connection to multiple Helpers
Automatic spreading of Introducers & Helpers furls.
We begin with a short introduction to Tahoe-LAFS, and then proceed to analyse the requirements for the 3 proposed features. The analysis includes a review of how related functionality is now implemented in Tahoe-LAFS, which files should be modified and what modifications should be included for each of those files.
As a short reminder, a Tahoe-LAFS grid is composed of several types of nodes:
Introducer: keeps track of StorageServer nodes connected to the grid and publishes it so that StorageClients know which are the nodes they can connect to.
StorageServer: form the distributed data store.
HelperServer: an intermediate server which can be used to minimize upload time. Due to the redundancy introduced by erasure coding, uploading a file to the grid can be an order of magnitude slower than reading from it. The HelperServer acts as a proxy which receives the encrypted data from the StorageClient (encrypted, but with no redundancy), performs the erasure encoding and distributes the resulting shares to StorageServers in the grid.
StorageClient: once they get the list of StorageServers in the grid from one introducer, they can connect to read and write data on the grid. Read operations are performed connecting directly to StorageServer nodes. Write operations can be performed connecting directly or using a HelperServer (only for immutable files as of Tahoe-LAFS 1.10.0).
For a full introduction to Tahoe-LAFS, see the docs folder in the source tree. You can also check the tutorial published on the Nilestore project's wiki1.
[Diagram showing the Tahoe-LAFS network topology, from the official Tahoe-LAFS documentation. Notice that Introducers and Helpers are not shown in it.]
Tahoe-LAFS is developed in Python (2.6.6 – 2.x) and has very good test coverage (around 92% for 1.10). In this section we give a short description of the Tahoe-LAFS source code.
We start by looking at Tahoe-LAFS source folder structure:
allmydata
├── frontends
├── immutable
│   └── downloader
├── introducer
├── mutable
├── scripts
├── storage
├── test
├── util
├── web
│   └── static
│       ├── css
│       └── img
└── windows
As a general rule, code specific to each feature's Client and Server is placed under that feature's folder, as client.py and server.py. All test files are placed under the test folder.
Some files relevant to the rest of the document2:
allmydata/client.py: this is the main file for the Tahoe-LAFS client; it contains the Client class, which initializes most of the services (StorageFarmBroker, StorageServer, web/FTP/SFTP frontends, Helper...).
allmydata/introducer/server.py: the server side of the Introducer.
allmydata/introducer/client.py: the client side of the Introducer.
allmydata/storage/server.py: the server side of the storage.
allmydata/immutable/upload.py: manages connections to the Helper from the client side.
allmydata/immutable/offloaded.py: the Helper, server side.
allmydata/storage_client.py: functions related to the storage client.
Support for quota management ('accounting') in Tahoe-LAFS has been an ongoing development for several years. The schema being used is based on the use of accounts, which could be managed by a central AuthorityServer or independently by each of the StorageServers (this option being suited only for smaller grids). A detailed description of the intended accounting architecture and development roadmap can be found in the project's documentation3.
The objective of quota management in CommunityCube is to ensure that a user who contributes a given amount of space to the grid can use the equivalent of that amount in it.
User accounts pose obvious risks regarding privacy/anonymity concerns. We have thus investigated a different approach to the problem: control quota management from the StorageClient itself.
This implementation comes, however, with its own set of drawbacks: it can be easily defeated by using a modified StorageClient, and it requires keeping a local record of the files stored in the grid4 (something Tahoe-LAFS does not otherwise require as long as you keep a copy of the capabilities you are interested in), which is also a big threat from the privacy point of view. As an alternative to keeping a record of every uploaded file, users can be forced to use a single directory as the root for all the files they upload (which is known as a rootcap5). The content under that directory can be accounted with a call to 'tahoe deep-check --add-lease ALIAS:', where ALIAS stands for the alias of the rootcap directory.
This approach seems to be the most compatible with CommunityCube's objectives, and its adoption relies on the belief that CommunityCube's users will play fair with the rest of the community members.
The proposed system can be easily bypassed by malicious actors, but it will however ensure that the grid is not abused due to user mistakes or lack of knowledge on the grid's working principles and capacity.
Quota management will be handled by the StorageClient, which imposes the limits on what can be uploaded to the grid. When a file is to be uploaded, the StorageClient:
Checks that the storage server is running and writable
Gathers the following data about the associated storage server and the client's own uploads:
Available disk space
Reserved disk space (minimum free space to be reserved)
Size of stored shares
The size of the leases it holds on files stored on the grid (this requires a catalog of uploaded files and lease expiration/renewal tracking).
Estimates the assigned space as 'Sharing space' (available + stored shares).
Checks that the used space (i.e. the sum of leases) is smaller than the 'Sharing space'.
Retrieves the grid's "X out of K" parameters used in erasure encoding.
Verifies the predicted used space after the upload and reports an error if the available quota would be exceeded (a sketch of this check follows the list).
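As an illustration of the last two steps, here is a minimal sketch in Python. The helpers get_storage_stats, get_total_leased_size and get_encoding_parameters are hypothetical accessors (not existing Tahoe-LAFS API) standing in for the new client functions proposed later in this document; the stats keys are the ones returned by StorageServer.get_stats (shown further below).

class QuotaError(Exception):
    pass

def check_quota_before_upload(client, file_size):
    # Sketch only: client.get_storage_stats(), client.get_total_leased_size() and
    # client.get_encoding_parameters() are assumed helpers, not real Tahoe-LAFS calls.
    stats = client.get_storage_stats()  # e.g. wrapping StorageServer.get_stats()
    if not stats.get('storage_server.accepting_immutable_shares'):
        raise QuotaError("local storage server is not running or not writable")

    # Space we contribute to the grid: what is still free plus what we already store.
    sharing_space = stats['storage_server.disk_avail'] + stats['storage_server.allocated']

    # Space we consume from the grid: the sum of the leases we hold,
    # taken from the local catalog of uploaded files.
    used_space = client.get_total_leased_size()

    # Erasure coding expands every uploaded byte by a factor of K/X ("X out of K").
    x_needed, k_total = client.get_encoding_parameters()  # e.g. (3, 10)
    predicted = used_space + file_size * k_total / float(x_needed)

    if predicted > sharing_space:
        raise QuotaError("uploading %d bytes would exceed the available quota" % file_size)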
We will have a look at how the following functionality is implemented in Tahoe-LAFS:
The upload of a file (to the Helper or directly to other StorageServers via the StorageFarmBroker).
Check if the StorageServer is running.
The statistics associated with the space used and available on the StorageServer.
The moment the leases are renewed in remote StorageServers.
In the next paragraphs we show how the system works with the corresponding code.
The upload of a file
The upload takes place at different classes depending on the type of data being uploaded. For immutable files, it is the Uploader service, which is defined in allmydata/immutable/upload.py. For mutable files, it is defined in allmydata/mutable/filenode.py.
These functions can be accessed from the main client, using an intermediate call to a NodeMaker instance, or directly calling the uploader:
File: allmydata/client.py
class Client(node.Node, pollmixin.PollMixin):
(…)
    # these four methods are the primitives for creating filenodes and
    # dirnodes. The first takes a URI and produces a filenode or (new-style)
    # dirnode. The other three create brand-new filenodes/dirnodes.

    def create_node_from_uri(self, write_uri, read_uri=None, deep_immutable=False, name="<unknown name>"):
        # This returns synchronously.
        # Note that it does *not* validate the write_uri and read_uri; instead we
        # may get an opaque node if there were any problems.
        return self.nodemaker.create_from_cap(write_uri, read_uri,
                                              deep_immutable=deep_immutable, name=name)

    def create_dirnode(self, initial_children={}, version=None):
        d = self.nodemaker.create_new_mutable_directory(initial_children, version=version)
        return d

    def create_immutable_dirnode(self, children, convergence=None):
        return self.nodemaker.create_immutable_directory(children, convergence)

    def create_mutable_file(self, contents=None, keysize=None, version=None):
        return self.nodemaker.create_mutable_file(contents, keysize, version=version)

    def upload(self, uploadable):
        uploader = self.getServiceNamed("uploader")
        return uploader.upload(uploadable)
File: allmydata/nodemaker.py
class NodeMaker: implements(INodeMaker)
(…)
def create_mutable_file(self, contents=None, keysize=None, version=None): if version is None: version = self.mutable_file_default n = MutableFileNode(self.storage_broker, self.secret_holder, self.default_encoding_parameters, self.history) d = self.key_generator.generate(keysize) d.addCallback(n.create_with_keys, contents, version=version) d.addCallback(lambda res: n) return d
def create_new_mutable_directory(self, initial_children={}, version=None): # initial_children must have metadata (i.e. {} instead of None) for (name, (node, metadata)) in initial_children.iteritems(): precondition(isinstance(metadata, dict), "create_new_mutable_directory requires metadata to be a dict, not None", metadata) node.raise_error() d = self.create_mutable_file(lambda n: MutableData(pack_children(initial_children, n.get_writekey())), version=version) d.addCallback(self._create_dirnode) return d
def create_immutable_directory(self, children, convergence=None): if convergence is None: convergence = self.secret_holder.get_convergence_secret() packed = pack_children(children, None, deep_immutable=True) uploadable = Data(packed, convergence) d = self.uploader.upload(uploadable) d.addCallback(lambda results: self.create_from_cap(None, results.get_uri())) d.addCallback(self._create_dirnode) return d
File: allmydata/immutable/upload.py
class Uploader(service.MultiService, log.PrefixingLogMixin): """I am a service that allows file uploading. I am a service-child of the Client. """ (...)
def upload(self, uploadable): """ Returns a Deferred that will fire with the UploadResults instance. """ assert self.parent assert self.running
uploadable = IUploadable(uploadable) d = uploadable.get_size() def _got_size(size): default_params = self.parent.get_encoding_parameters() precondition(isinstance(default_params, dict), default_params) precondition("max_segment_size" in default_params, default_params) uploadable.set_default_encoding_parameters(default_params)
if self.stats_provider: self.stats_provider.count('uploader.files_uploaded', 1) self.stats_provider.count('uploader.bytes_uploaded', size)
if size <= self.URI_LIT_SIZE_THRESHOLD: uploader = LiteralUploader() return uploader.start(uploadable) else: eu = EncryptAnUploadable(uploadable, self._parentmsgid) d2 = defer.succeed(None) storage_broker = self.parent.get_storage_broker() if self._helper: uploader = AssistedUploader(self._helper, storage_broker) d2.addCallback(lambda x: eu.get_storage_index()) d2.addCallback(lambda si: uploader.start(eu, si)) else: storage_broker = self.parent.get_storage_broker() secret_holder = self.parent._secret_holder uploader = CHKUploader(storage_broker, secret_holder) d2.addCallback(lambda x: uploader.start(eu))
self._all_uploads[uploader] = None if self._history: self._history.add_upload(uploader.get_upload_status()) def turn_verifycap_into_read_cap(uploadresults): # Generate the uri from the verifycap plus the key. d3 = uploadable.get_encryption_key() def put_readcap_into_results(key): v = uri.from_string(uploadresults.get_verifycapstr()) r = uri.CHKFileURI(key, v.uri_extension_hash, v.needed_shares, v.total_shares, v.size) uploadresults.set_uri(r.to_string()) return uploadresults d3.addCallback(put_readcap_into_results) return d3 d2.addCallback(turn_verifycap_into_read_cap) return d2 d.addCallback(_got_size) def _done(res): uploadable.close() return res d.addBoth(_done) return d
We have highlighted the callback that starts the upload (the _got_size callback inside Uploader.upload) and the three available ways to upload immutable content: with a LiteralUploader for small files, with a Helper, or directly through the StorageFarmBroker.
In the case of mutable files, we have to check the moment when we upload a new file and when we want to modify it (or fully overwrite it via MutableFileNode.overwrite or MutableFileNode.update):
File: allmydata/mutable/filenode.py
class MutableFileNode:
    implements(IMutableFileNode, ICheckable)
def __init__(self, storage_broker, secret_holder, default_encoding_parameters, history): self._storage_broker = storage_broker self._secret_holder = secret_holder self._default_encoding_parameters = default_encoding_parameters self._history = history self._pubkey = None # filled in upon first read self._privkey = None # filled in if we're mutable # we keep track of the last encoding parameters that we use. These # are updated upon retrieve, and used by publish. If we publish # without ever reading (i.e. overwrite()), then we use these values. self._required_shares = default_encoding_parameters["k"] self._total_shares = default_encoding_parameters["n"] self._sharemap = {} # known shares, shnum-to-[nodeids] self._most_recent_size = None (...) def create_with_keys(self, (pubkey, privkey), contents, version=SDMF_VERSION): """Call this to create a brand-new mutable file. It will create the shares, find homes for them, and upload the initial contents (created with the same rules as IClient.create_mutable_file() ). Returns a Deferred that fires (with the MutableFileNode instance you should use) when it completes. """ self._pubkey, self._privkey = pubkey, privkey pubkey_s = self._pubkey.serialize() privkey_s = self._privkey.serialize() self._writekey = hashutil.ssk_writekey_hash(privkey_s) self._encprivkey = self._encrypt_privkey(self._writekey, privkey_s) self._fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s) if version == MDMF_VERSION: self._uri = WriteableMDMFFileURI(self._writekey, self._fingerprint) self._protocol_version = version elif version == SDMF_VERSION: self._uri = WriteableSSKFileURI(self._writekey, self._fingerprint) self._protocol_version = version self._readkey = self._uri.readkey self._storage_index = self._uri.storage_index initial_contents = self._get_initial_contents(contents) return self._upload(initial_contents, None)
(…)
def overwrite(self, new_contents): """ I overwrite the contents of the best recoverable version of this mutable file with new_contents. This is equivalent to calling overwrite on the result of get_best_mutable_version with new_contents as an argument. I return a Deferred that eventually fires with the results of my replacement process. """ # TODO: Update downloader hints. return self._do_serialized(self._overwrite, new_contents)
(…)
def upload(self, new_contents, servermap): """ I overwrite the contents of the best recoverable version of this mutable file with new_contents, using servermap instead of creating/updating our own servermap. I return a Deferred that fires with the results of my upload. """ # TODO: Update downloader hints return self._do_serialized(self._upload, new_contents, servermap)
def modify(self, modifier, backoffer=None): """ I modify the contents of the best recoverable version of this mutable file with the modifier. This is equivalent to calling modify on the result of get_best_mutable_version. I return a Deferred that eventually fires with an UploadResults instance describing this process. """ # TODO: Update downloader hints. return self._do_serialized(self._modify, modifier, backoffer)
In addition to the relevant functions, we have also highlighted the values for k and n, which are required to estimate how much disk space a new file will take.
Check if the StorageServer is running.
The StorageServer is initialized in allmydata/client.py, in function Client.init_storage, according to configuration values.
File: allmydata/client.py
class Client(node.Node, pollmixin.PollMixin):
(…)
def init_storage(self): # should we run a storage server (and publish it for others to use)? if not self.get_config("storage", "enabled", True, boolean=True): return readonly = self.get_config("storage", "readonly", False, boolean=True)
storedir = os.path.join(self.basedir, self.STOREDIR)
data = self.get_config("storage", "reserved_space", None) try: reserved = parse_abbreviated_size(data) except ValueError: log.msg("[storage]reserved_space= contains unparseable value %s" % data) raise if reserved is None: reserved = 0 (…) ss = StorageServer(storedir, self.nodeid, reserved_space=reserved, discard_storage=discard, readonly_storage=readonly, stats_provider=self.stats_provider, expiration_enabled=expire, expiration_mode=mode, expiration_override_lease_duration=o_l_d, expiration_cutoff_date=cutoff_date, expiration_sharetypes=expiration_sharetypes) self.add_service(ss)
d = self.when_tub_ready() # we can't do registerReference until the Tub is ready def _publish(res): furl_file = os.path.join(self.basedir, "private", "storage.furl").encode(get_filesystem_encoding()) furl = self.tub.registerReference(ss, furlFile=furl_file) ann = {"anonymous-storage-FURL": furl, "permutation-seed-base32": self._init_permutation_seed(ss), }
current_seqnum, current_nonce = self._sequencer()
for ic in self.introducer_clients: ic.publish("storage", ann, current_seqnum, current_nonce, self._node_key)
d.addCallback(_publish) d.addErrback(log.err, facility="tahoe.init", level=log.BAD, umid="aLGBKw")
To find out if the StorageServer is running we have to recover the parent of the service we are in (i.e. the Uploader). We will be working with services which are 'children' of the main Client instance, and we can check whether the client is running a given service (i.e. the storage service) as it is done in allmydata/web/root.py:
File: allmydata/web/root.py
class Root(rend.Page):
    (...)
    def __init__(self, client, clock=None, now=None):
        (...)
        try:
            s = client.getServiceNamed("storage")
        except KeyError:
            s = None
        (...)
The statistics associated with the space used and available on the StorageServer.
From the StorageServer service we get access to the StorageServer.get_stats function:
class StorageServer(service.MultiService, Referenceable):
(…)
def get_stats(self): # remember: RIStatsProvider requires that our return dict # contains numeric values. stats = { 'storage_server.allocated': self.allocated_size(), } stats['storage_server.reserved_space'] = self.reserved_space for category,ld in self.get_latencies().items(): for name,v in ld.items(): stats['storage_server.latencies.%s.%s' % (category, name)] = v
try: disk = fileutil.get_disk_stats(self.sharedir, self.reserved_space) writeable = disk['avail'] > 0
# spacetime predictors should use disk_avail / (d(disk_used)/dt) stats['storage_server.disk_total'] = disk['total'] stats['storage_server.disk_used'] = disk['used'] stats['storage_server.disk_free_for_root'] = disk['free_for_root'] stats['storage_server.disk_free_for_nonroot'] = disk['free_for_nonroot'] stats['storage_server.disk_avail'] = disk['avail'] except AttributeError: writeable = True except EnvironmentError: log.msg("OS call to get disk statistics failed", level=log.UNUSUAL) writeable = False
if self.readonly_storage: stats['storage_server.disk_avail'] = 0 writeable = False
stats['storage_server.accepting_immutable_shares'] = int(writeable) s = self.bucket_counter.get_state() bucket_count = s.get("last-complete-bucket-count") if bucket_count: stats['storage_server.total_bucket_count'] = bucket_count return stats
The leases held by the StorageClient, and their equivalent size on disk (i.e. the amount of storage we have spent).
Leases are created whenever we upload a new file, and they are renewed from the client at three points: in immutable/checker.py (lease renewal for immutable files), in mutable/servermap.py (called from mutable/checker.py, lease renewal for mutable files) and in scripts/tahoe_check.py (cli interface).
File: allmydata/immutable/checker.py
class Checker(log.PrefixingLogMixin): """I query all servers to see if M uniquely-numbered shares are available.
(…)
def _get_buckets(self, s, storageindex): """Return a deferred that eventually fires with ({sharenum: bucket}, serverid, success). In case the server is disconnected or returns a Failure then it fires with ({}, serverid, False) (A server disconnecting or returning a Failure when we ask it for buckets is the same, for our purposes, as a server that says it has none, except that we want to track and report whether or not each server responded.)"""
rref = s.get_rref() lease_seed = s.get_lease_seed() if self._add_lease: renew_secret = self._get_renewal_secret(lease_seed) cancel_secret = self._get_cancel_secret(lease_seed) d2 = rref.callRemote("add_lease", storageindex, renew_secret, cancel_secret) d2.addErrback(self._add_lease_failed, s.get_name(), storageindex)
(...) |
File: allmydata/mutable/servermap.py
class ServermapUpdater: def __init__(self, filenode, storage_broker, monitor, servermap, mode=MODE_READ, add_lease=False, update_range=None): """I update a servermap, locating a sufficient number of useful shares and remembering where they are located.
"""
(…)
def _do_read(self, server, storage_index, shnums, readv): ss = server.get_rref() if self._add_lease: # send an add-lease message in parallel. The results are handled # separately. This is sent before the slot_readv() so that we can # be sure the add_lease is retired by the time slot_readv comes # back (this relies upon our knowledge that the server code for # add_lease is synchronous). renew_secret = self._node.get_renewal_secret(server) cancel_secret = self._node.get_cancel_secret(server) d2 = ss.callRemote("add_lease", storage_index, renew_secret, cancel_secret) # we ignore success d2.addErrback(self._add_lease_failed, server, storage_index) d = ss.callRemote("slot_readv", storage_index, shnums, readv) return d (...) |
File: allmydata/scripts/tahoe_check.py
def check_location(options, where): stdout = options.stdout stderr = options.stderr nodeurl = options['node-url'] if not nodeurl.endswith("/"): nodeurl += "/" try: rootcap, path = get_alias(options.aliases, where, DEFAULT_ALIAS) except UnknownAliasError, e: e.display(stderr) return 1 if path == '/': path = '' url = nodeurl + "uri/%s" % urllib.quote(rootcap) if path: url += "/" + escape_path(path) # todo: should it end with a slash? url += "?t=check&output=JSON" if options["verify"]: url += "&verify=true" if options["repair"]: url += "&repair=true" if options["add-lease"]: url += "&add-lease=true"
resp = do_http("POST", url) if resp.status != 200: print >>stderr, format_http_error("ERROR", resp) return 1 jdata = resp.read() if options.get("raw"): stdout.write(jdata) stdout.write("\n") return 0 data = simplejson.loads(jdata)
File: allmydata/client.py
Introduce code in the functions used to create new nodes to keep track of the files uploaded to the grid (see the sketch after this list). It may be required to move this accounting code down to the immutable Uploader.upload function or the mutable MutableFileNode.update/overwrite functions if they are called directly from other parts of Tahoe-LAFS (not exclusively from the client). Alternatively, if we are using the single-rootcap strategy, force any new file to lie under the rootcap.
Create a new function in the client that recovers the StorageServer service and accesses its usage statistics, the erasure encoding parameters and the statistics for uploaded files in order to estimate the remaining quota.
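A minimal sketch of what the uploads catalog could look like, assuming a simple append-only record under BASEDIR/private; the file name, record format and hook point are assumptions, not existing Tahoe-LAFS behaviour.

import json, os, time

def record_upload(basedir, cap, size):
    # Append one record per uploaded file. A real implementation would also track
    # lease renewals and deletions, and would be called from the Deferred callbacks
    # of Client.upload / NodeMaker once the resulting cap is known.
    catalog = os.path.join(basedir, "private", "uploads.jsonl")  # assumed location
    entry = {"cap": cap, "size": size, "uploaded": time.time()}
    with open(catalog, "a") as f:
        f.write(json.dumps(entry) + "\n")

Summing the 'size' fields of this catalog (multiplied by the erasure-coding expansion factor) gives the 'used space' figure needed by the quota check sketched earlier.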
Files: immutable/checker.py, mutable/servermap.py, scripts/tahoe_check.py
Introduce accounting of the times a lease is renewed against the database of uploaded files (only needed if we are creating a local database; this is not required if we are using the single root directory).
File: allmydata/web/root.py
Add functionality to show the updated quota data.
File: allmydata/web/welcome.xhtml
Modify the template to show shared remaining/total quota information.
File: allmydata/test/test_client.py
Add tests to verify that new uploads are properly accounted in the uploads database (or that they lie under the rootcap dir)
File: allmydata/test/test_storage.py
Add tests to verify that new uploads are properly accounted in the uploads database (or that they lie under the rootcap dir)
File: docs/architecture.rst
Include a brief description of the quota management system implementation.
File: docs/quotas.rst
Create a new file under docs describing in detail the implemented quota system.
Helpers are used in Tahoe-LAFS to cope with the overhead factor imposed by erasure coding and with the asymmetric upload/download bandwidth of ADSL connections. Uploading a file requires K/X times more bandwidth than the file size (considering we use an 'X out of K' storage scheme) and than the corresponding download operation from the grid. Given these asymmetric bandwidth requirements and upload/download channel capacities, the upload operation can be orders of magnitude slower than its corresponding download.
To help ease this problem, Helper nodes (assumed to have an uplink with greater capacity than the user's) receive the ciphertext directly from the StorageClient (i.e. files that have already been encrypted, but have not yet been segmented and erasure-coded), erasure-code it and distribute the resulting shares to StorageServers. This way the amount of data to be uploaded by the StorageClient is limited to the size of the file to be uploaded, with the overhead being handled by the Helper.
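As a concrete illustration of this overhead (3-of-10 is simply the Tahoe-LAFS default encoding, used here as an example):

def direct_upload_traffic(file_size, x_needed=3, k_total=10):
    # Bytes that must leave the client when uploading without a Helper,
    # given an "X out of K" erasure coding scheme.
    return file_size * k_total / float(x_needed)

# A 100 MB file expands to ~333 MB of shares with the default 3-of-10 encoding,
# so a direct upload sends ~3.3x the file size, while a Helper-assisted upload
# sends only ~100 MB of ciphertext from the client.
print(direct_upload_traffic(100 * 1024 * 1024) / (1024.0 * 1024))  # ~333.3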
As of version 1.10, i2p-Tahoe-LAFS can only be configured to use a single helper server, which (if used) must be specified in tahoe.cfg. Allowing the StorageClient to choose among a list of available helpers will add flexibility to the network and allow the StorageClient to choose the least-loaded Helper at a given moment.
Instead of the single value now stored in tahoe.cfg, we need a list of Helpers and the possibility to select one of them from that list using a particular selection algorithm.
Allow for a variable number of helpers, statically listed in “BASEDIR/helpers”.
Before sending a file to a helper:
Check all the helpers to retrieve their statistics.
Choose the helper with the best statistics (see the sketch after this list).
Send the ciphertext to the chosen Helper.
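A sketch of the selection step; the helper objects and their is_connected()/stats attributes are assumptions about the wrapper class proposed later in the Uploader refactoring, not existing Tahoe-LAFS API.

def choose_helper(helpers):
    # Pick the connected helper with the smallest reported load; return None
    # so the caller can fall back to the StorageFarmBroker.
    connected = [h for h in helpers if h.is_connected()]
    if not connected:
        return None
    # 'pending_uploads' stands in for whatever statistic the helpers end up reporting.
    return min(connected, key=lambda h: h.stats.get("pending_uploads", 0))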
When a new client is started, it recovers the helper.furl from section [client] in tahoe.cfg. Its value is then used to initialize the Uploader service, as seen below:
File: allmydata/client.py
class Client(node.Node, pollmixin.PollMixin):
(…)
    def init_client(self):
        helper_furl = self.get_config("client", "helper.furl", None)
        if helper_furl in ("None", ""):
            helper_furl = None

        DEP = self.encoding_params
        DEP["k"] = int(self.get_config("client", "shares.needed", DEP["k"]))
        DEP["n"] = int(self.get_config("client", "shares.total", DEP["n"]))
        DEP["happy"] = int(self.get_config("client", "shares.happy", DEP["happy"]))

        self.init_client_storage_broker()
        self.history = History(self.stats_provider)
        self.terminator = Terminator()
        self.terminator.setServiceParent(self)
        self.add_service(Uploader(helper_furl, self.stats_provider, self.history))
In the Uploader class we find the code that initializes the helper connection, handles the moments when the connection to the server is established or lost, and recovers helper information:
File: allmydata/immutable/upload.py
class Uploader(service.MultiService, log.PrefixingLogMixin):
    (...)
    def __init__(self, helper_furl=None, stats_provider=None, history=None):
        self._helper_furl = helper_furl
        self.stats_provider = stats_provider
        self._history = history
        self._helper = None
        self._all_uploads = weakref.WeakKeyDictionary() # for debugging
        log.PrefixingLogMixin.__init__(self, facility="tahoe.immutable.upload")
        service.MultiService.__init__(self)

    def startService(self):
        service.MultiService.startService(self)
        if self._helper_furl:
            self.parent.tub.connectTo(self._helper_furl, self._got_helper)

    def _got_helper(self, helper):
        self.log("got helper connection, getting versions")
        default = { "http://allmydata.org/tahoe/protocols/helper/v1" :
                    { },
                    "application-version": "unknown: no get_version()",
                    }
        d = add_version_to_remote_reference(helper, default)
        d.addCallback(self._got_versioned_helper)

    def _got_versioned_helper(self, helper):
        needed = "http://allmydata.org/tahoe/protocols/helper/v1"
        if needed not in helper.version:
            raise InsufficientVersionError(needed, helper.version)
        self._helper = helper

    def _lost_helper(self):
        self._helper = None

    def get_helper_info(self):
        # return a tuple of (helper_furl_or_None, connected_bool)
        return (self._helper_furl, bool(self._helper))
Finally, in the upload function, the Helper connection is used if it is available; otherwise the node's storage broker is used:
File: allmydata/immutable/upload.py
class Uploader(service.MultiService, log.PrefixingLogMixin):
(...)
def upload(self, uploadable): """ Returns a Deferred that will fire with the UploadResults instance. """ assert self.parent assert self.running
uploadable = IUploadable(uploadable) d = uploadable.get_size() def _got_size(size): default_params = self.parent.get_encoding_parameters() precondition(isinstance(default_params, dict), default_params) precondition("max_segment_size" in default_params, default_params) uploadable.set_default_encoding_parameters(default_params)
if self.stats_provider: self.stats_provider.count('uploader.files_uploaded', 1) self.stats_provider.count('uploader.bytes_uploaded', size)
if size <= self.URI_LIT_SIZE_THRESHOLD: uploader = LiteralUploader() return uploader.start(uploadable) else: eu = EncryptAnUploadable(uploadable, self._parentmsgid) d2 = defer.succeed(None) storage_broker = self.parent.get_storage_broker() if self._helper: uploader = AssistedUploader(self._helper, storage_broker) d2.addCallback(lambda x: eu.get_storage_index()) d2.addCallback(lambda si: uploader.start(eu, si)) else: storage_broker = self.parent.get_storage_broker() secret_holder = self.parent._secret_holder uploader = CHKUploader(storage_broker, secret_holder) d2.addCallback(lambda x: uploader.start(eu))
self._all_uploads[uploader] = None if self._history: self._history.add_upload(uploader.get_upload_status()) def turn_verifycap_into_read_cap(uploadresults): # Generate the uri from the verifycap plus the key. d3 = uploadable.get_encryption_key() def put_readcap_into_results(key): v = uri.from_string(uploadresults.get_verifycapstr()) r = uri.CHKFileURI(key, v.uri_extension_hash, v.needed_shares, v.total_shares, v.size) uploadresults.set_uri(r.to_string()) return uploadresults d3.addCallback(put_readcap_into_results) return d3 d2.addCallback(turn_verifycap_into_read_cap) return d2 d.addCallback(_got_size) def _done(res): uploadable.close() return res d.addBoth(_done) return d
Rendering related to the uploader is done in the web interface:
File: allmydata/web/root.py
class Root(rend.Page):
    def data_helper_furl_prefix(self, ctx, data):
        try:
            uploader = self.client.getServiceNamed("uploader")
        except KeyError:
            return None
        furl, connected = uploader.get_helper_info()
        if not furl:
            return None
        # trim off the secret swissnum
        (prefix, _, swissnum) = furl.rpartition("/")
        return "%s/[censored]" % (prefix,)

    def data_helper_description(self, ctx, data):
        if self.data_connected_to_helper(ctx, data) == "no":
            return "Helper not connected"
        return "Helper"

    def data_connected_to_helper(self, ctx, data):
        try:
            uploader = self.client.getServiceNamed("uploader")
        except KeyError:
            return "no" # we don't even have an Uploader
        furl, connected = uploader.get_helper_info()

        if furl is None:
            return "not-configured"
        if connected:
            return "yes"
        return "no"
These functions are accessed from the welcome page template, which gets rendered by Nevow:
File: allmydata/web/welcome.xhtml
(…)
<div>
  <h3>
    <div><n:attr name="class">status-indicator connected-<n:invisible n:render="string" n:data="connected_to_helper" /></n:attr></div>
    <div n:render="string" n:data="helper_description" />
  </h3>
  <div class="furl" n:render="string" n:data="helper_furl_prefix" />
</div>
(…) |
Tests are implemented in allmydata/test/test_helper.py
File: allmydata/test/test_helper.py class AssistedUpload(unittest.TestCase): (...) def setUpHelper(self, basedir, helper_class=Helper_fake_upload): fileutil.make_dirs(basedir) self.helper = h = helper_class(basedir, self.s.storage_broker, self.s.secret_holder, None, None) self.helper_furl = self.tub.registerReference(h)
def test_one(self): self.basedir = "helper/AssistedUpload/test_one" self.setUpHelper(self.basedir) u = upload.Uploader(self.helper_furl) u.setServiceParent(self.s)
d = wait_a_few_turns()
def _ready(res): assert u._helper
return upload_data(u, DATA, convergence="some convergence string") d.addCallback(_ready) (…)
def test_previous_upload_failed(self):
(...) f = open(encfile, "wb") f.write(encryptor.process(DATA)) f.close()
u = upload.Uploader(self.helper_furl) u.setServiceParent(self.s)
d = wait_a_few_turns()
def _ready(res): assert u._helper return upload_data(u, DATA, convergence="test convergence string") d.addCallback(_ready)
(…)
def test_already_uploaded(self): self.basedir = "helper/AssistedUpload/test_already_uploaded" self.setUpHelper(self.basedir, helper_class=Helper_already_uploaded) u = upload.Uploader(self.helper_furl) u.setServiceParent(self.s)
d = wait_a_few_turns()
File: allmydata/client.py
Add a MULTI_HELPERS_CFG variable with the path to the helpers file.
Create an init_upload_helpers_list function to parse the file and return the list of furls; it must also take into account helper.furl in tahoe.cfg for backwards compatibility (see the sketch after this list).
Update init_client to call init_upload_helpers_list. Refactor the code that reads and writes the multiple-introducers list into a generic 'list of furls' manager that can be shared by the multiple-introducers and the multiple-helpers initialization code. This refactoring will also be useful for feature number 3, spreading servers, given that both lists will be updated with a similar mechanism.
Eventually rename init_helper to init_helper_server.
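A minimal sketch of the proposed init_upload_helpers_list, modelled on the parsing loop of the existing init_introducer_clients; the MULTI_HELPERS_CFG value and the exact merge with helper.furl are assumptions (os is already imported at module level in client.py).

MULTI_HELPERS_CFG = "helpers"  # assumed file name under BASEDIR

def init_upload_helpers_list(self):
    # Read BASEDIR/helpers (one furl per line, '#' starts a comment) and merge
    # the legacy [client]helper.furl value from tahoe.cfg for compatibility.
    helper_furls = []
    cfg = os.path.join(self.basedir, MULTI_HELPERS_CFG)
    if os.path.exists(cfg):
        with open(cfg) as f:
            for line in f:
                furl = line.strip()
                if furl and not furl.startswith("#"):
                    helper_furls.append(furl)
    legacy = self.get_config("client", "helper.furl", None)
    if legacy and legacy not in ("None", "") and legacy not in helper_furls:
        helper_furls.append(legacy)
    return helper_furls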
File: allmydata/immutable/upload.py
Refactor Uploader:
Create a wrapper class to handle connections with remote helper servers, using the functions _got_helper, _got_versioned_helper, _lost_helper and get_helper_info from the Uploader class.
Create a list of available helpers from the helpers list passed during initialization.
Create a hook function to select which server to use for uploading:
Choose the best helper server to upload based on the availability of helper servers and their statistics.
Fall back to the standard storage broker if no helper is available.
File: allmydata/web/root.py / allmydata/web/welcome.xhtml
Modify the functions Root.data_helper_furl_prefix, Root.data_helper_description and Root.data_connected_to_helper and the Nevow template to accommodate a list of helpers instead of a single one. (See the patch for Tahoe-LAFS issue #1010 for those two files.)
File: allmydata/test/test_helper.py
Add several fake uploaders to the file and verify that the selection works correctly according to the (fake) server statistics.
New file: allmydata/test/test_multi_helpers.py
New test file to check that the client properly parses the list of multiple helpers and that the Uploader is also properly initialized (see allmydata/test/test_multi_introducers.py for reference).
Describe the changes implemented in the following files:
docs/architecture.rst.
docs/configuration.rst.
docs/helper.rst.
Patches for similar functionality have already been published into Tahoe-LAFS repository. They can be used as a guide for implementation details:
Support of multiple introducers: provides a sample of how to move from a single introducer to a list of introducers6 7.
Hook in server selection when choosing a remote StorageServer: sample of how we can implement a programmable hook to choose the target server in a generic way8.
Version 1.10 of Tahoe-LAFS allows specifying a list of multiple introducers. However, this list is static, specified per installation in the BASEDIR/introducers file (thanks to the multi-introducers patch used in i2p-Tahoe-LAFS), given that the introducer only publishes a list of available StorageServers and not of available Introducers. The same will apply to the list of Helpers once the multi-helpers modification is implemented.
The proposed feature consists of:
a) publishing a list of known Introducers that will be used to update the StorageClient's list of introducers.
b) publishing a list of known Helpers that will be used to update the StorageClient's list of helpers.
Configuration in tahoe.cfg will be used to indicate the following (a configuration sketch follows this list):
In StorageClients:
Whether or not the list of Introducers should be updated automatically.
Whether or not the list of Helpers should be updated automatically.
In Helper nodes:
Whether the furl of the Helper node should be published via the Introducer.
In Introducer nodes:
Whether the list of alternative introducers at BASEDIR/introducers should be published.
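Putting the three cases together, the configuration could look roughly as follows. The option names match those proposed for docs/configuration.rst later in this document, while their section placement and default values are assumptions.

# tahoe.cfg (sketch)

[client]                                 # on StorageClients
auto_update_introducers = true           # keep BASEDIR/introducers in sync
auto_update_helpers = true               # keep BASEDIR/helpers in sync

[helper]                                 # on Helper nodes
publish_helper_furl = true               # announce this helper via the Introducer

[node]                                   # on Introducer nodes
publish_alternative_introducers = true   # announce the furls listed in BASEDIR/introducers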
We will use existing Introducer infrastructure to publish the furls of Helpers and Introducers.
Required functionality:
A StorageClient can subscribe to notifications of 'introducer' and 'helper' services, in addition to the 'storage' service to which it subscribes now.
The StorageClient will update the BASEDIR/helpers or BASEDIR/introducers file according to the data received from the Introducer (see the sketch after this list).
A Helper can publish its furl via an Introducer, which will distribute it to connected StorageClients.
An Introducer can publish a list of alternative Introducers to the StorageClients that are connected to it. The list distributed is that stored in the BASEDIR/introducers file.
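A sketch of the client-side subscription and update logic, in the style of StorageFarmBroker.use_introducer shown below; the function names, the announcement keys and the _remember_furl helper are assumptions.

def init_subscriptions(self):
    # Proposed Client.init_subscriptions: listen for 'helper' and 'introducer'
    # announcements on every introducer client and persist the furls locally.
    for ic in self.introducer_clients:
        ic.subscribe_to("helper",
            lambda key_s, ann: self._remember_furl("helpers",
                                                   ann.get("anonymous-helper-FURL")))
        ic.subscribe_to("introducer",
            lambda key_s, ann: self._remember_furl("introducers",
                                                   ann.get("anonymous-introducer-FURL")))

def _remember_furl(self, filename, furl):
    # Append the furl to BASEDIR/helpers or BASEDIR/introducers if it is new.
    if not furl:
        return
    path = os.path.join(self.basedir, filename)
    known = set()
    if os.path.exists(path):
        with open(path) as f:
            known = set(line.strip() for line in f)
    if furl not in known:
        with open(path, "a") as f:
            f.write(furl + "\n")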
We analyse functionality related to the modifications listed above:
The initialization of the introducers list from the configuration file
The connection of the StorageClient to the IntroducerServer (using its IntroducerClient), and how it publishes its furl and subscribes to receive the furls of other StorageServers.
The initialization of a Helper server.
The initialization of an Introducer server.
Below we can find the code that initializes the list of introducers in allmydata/client.py:
File: allmydata/client.py
class Client(node.Node, pollmixin.PollMixin):
(…)
def __init__(self, basedir="."): node.Node.__init__(self, basedir) self.started_timestamp = time.time() self.logSource="Client" self.encoding_params = self.DEFAULT_ENCODING_PARAMETERS.copy() self.init_introducer_clients() self.init_stats_provider() self.init_secrets() self.init_node_key() self.init_storage()
(…)
def init_introducer_clients(self): self.introducer_furls = [] self.warn_flag = False # Try to load ""BASEDIR/introducers" cfg file cfg = os.path.join(self.basedir, MULTI_INTRODUCERS_CFG) if os.path.exists(cfg): f = open(cfg, 'r') for introducer_furl in f.read().split('\n'): introducers_furl = introducer_furl.strip() if introducers_furl.startswith('#') or not introducers_furl: continue self.introducer_furls.append(introducer_furl) f.close() furl_count = len(self.introducer_furls) #print "@icfg: furls: %d" %furl_count
# read furl from tahoe.cfg ifurl = self.get_config("client", "introducer.furl", None) if ifurl and ifurl not in self.introducer_furls: self.introducer_furls.append(ifurl) f = open(cfg, 'a') f.write(ifurl) f.write('\n') f.close() if furl_count > 1: self.warn_flag = True self.log("introducers config file modified.") print "Warning! introducers config file modified."
# create a pool of introducer_clients self.introducer_clients = []
The first block highlighted in init_introducer_clients tries to read the BASEDIR/introducers file; the second adds introducer.furl from tahoe.cfg if it was not already contained in BASEDIR/introducers.
The second piece of functionality we are interested in is using the existing introducer infrastructure to update the lists of Introducers and Helpers. Below we find the relevant code used to subscribe the StorageFarmBroker (responsible for keeping in touch with the StorageServers in the grid) to the Introducer's 'storage' announcements, as an example of how we will have to publish the corresponding 'helper' and 'introducer' announcements:
File: allmydata/storage_client.py
class StorageFarmBroker: implements(IStorageBroker) """I live on the client, and know about storage servers. For each server that is participating in a grid, I either maintain a connection to it or remember enough information to establish a connection to it on demand. I'm also responsible for subscribing to the IntroducerClient to find out about new servers as they are announced by the Introducer. """ (...)
def use_introducer(self, introducer_client): self.introducer_client = ic = introducer_client ic.subscribe_to("storage", self._got_announcement)
def _got_announcement(self, key_s, ann): if key_s is not None: precondition(isinstance(key_s, str), key_s) precondition(key_s.startswith("v0-"), key_s) assert ann["service-name"] == "storage" s = NativeStorageServer(key_s, ann) serverid = s.get_serverid() old = self.servers.get(serverid) if old: if old.get_announcement() == ann: return # duplicate # replacement del self.servers[serverid] old.stop_connecting() # now we forget about them and start using the new one self.servers[serverid] = s s.start_connecting(self.tub, self._trigger_connections) # the descriptor will manage their own Reconnector, and each time we # need servers, we'll ask them if they're connected or not.
def _trigger_connections(self): # when one connection is established, reset the timers on all others, # to trigger a reconnection attempt in one second. This is intended # to accelerate server connections when we've been offline for a # while. The goal is to avoid hanging out for a long time with # connections to only a subset of the servers, which would increase # the chances that we'll put shares in weird places (and not update (...) |
Function StorageFarmBroker.use_introducer subscribes to the 'storage' announcements with callback StorageFarmBroker._got_announcement, which tries to establish a connection with the new server whenever it receives the announcement.
During the StorageServer initialization, the announcement that this server is active is published when the connection with the introducer is ready (with the call to ic.publish):
File: allmydata/client.py
class Client(node.Node, pollmixin.PollMixin): implements(IStatsProducer)
(…)
def init_storage(self): # should we run a storage server (and publish it for others to use)? if not self.get_config("storage", "enabled", True, boolean=True): return readonly = self.get_config("storage", "readonly", False, boolean=True)
storedir = os.path.join(self.basedir, self.STOREDIR)
(..) ss = StorageServer(storedir, self.nodeid, reserved_space=reserved, discard_storage=discard, readonly_storage=readonly, stats_provider=self.stats_provider, expiration_enabled=expire, expiration_mode=mode, expiration_override_lease_duration=o_l_d, expiration_cutoff_date=cutoff_date, expiration_sharetypes=expiration_sharetypes) self.add_service(ss)
d = self.when_tub_ready() # we can't do registerReference until the Tub is ready def _publish(res): furl_file = os.path.join(self.basedir, "private", "storage.furl").encode(get_filesystem_encoding()) furl = self.tub.registerReference(ss, furlFile=furl_file) ann = {"anonymous-storage-FURL": furl, "permutation-seed-base32": self._init_permutation_seed(ss), }
current_seqnum, current_nonce = self._sequencer()
for ic in self.introducer_clients: ic.publish("storage", ann, current_seqnum, current_nonce, self._node_key)
d.addCallback(_publish) d.addErrback(log.err, facility="tahoe.init", level=log.BAD, umid="aLGBKw") |
To publish the address of a Helper node, we will have to do it after its creation and registration in Client.init_helper (which is the function that initializes the Helper server):
File: allmydata/client.py
class Client(node.Node, pollmixin.PollMixin): implements(IStatsProducer)
(…)
def init_helper(self): d = self.when_tub_ready() def _publish(self): self.helper = Helper(os.path.join(self.basedir, "helper"), self.storage_broker, self._secret_holder, self.stats_provider, self.history) # TODO: this is confusing. BASEDIR/private/helper.furl is created # by the helper. BASEDIR/helper.furl is consumed by the client # who wants to use the helper. I like having the filename be the # same, since that makes 'cp' work smoothly, but the difference # between config inputs and generated outputs is hard to see. helper_furlfile = os.path.join(self.basedir, "private", "helper.furl").encode(get_filesystem_encoding()) self.tub.registerReference(self.helper, furlFile=helper_furlfile) d.addCallback(_publish) d.addErrback(log.err, facility="tahoe.init", level=log.BAD, umid="K0mW5w") |
A parameter in the helper server's config file will tell whether or not we should publish the helper's address via the introducer.
Regarding the publication of the updated list of Introducers, an IntroducerServer is not connected to another Introducer; however, it can publish a list of introducers which is initially preloaded at BASEDIR/introducers (the same file that would be used by a standard node). We will only have to modify the code for the initialization of the Introducer at allmydata/introducer/server.py, parse the introducers file and publish the corresponding announcements with a call to IntroducerNode.publish. (Notice that the highlighted _publish function means 'publish this furl to the corresponding Tub', i.e. make this furl accessible from the outside; from there we have to issue a call to the IntroducerService to publish the corresponding information.) We may have to connect to every introducer on the list to verify that they are up and to recover additional information about them.
File: allmydata/introducer/server.py
class IntroducerNode(node.Node): PORTNUMFILE = "introducer.port" NODETYPE = "introducer" GENERATED_FILES = ['introducer.furl']
def __init__(self, basedir="."): node.Node.__init__(self, basedir) self.read_config() self.init_introducer() webport = self.get_config("node", "web.port", None) if webport: self.init_web(webport) # strports string
def init_introducer(self): introducerservice = IntroducerService(self.basedir) self.add_service(introducerservice)
old_public_fn = os.path.join(self.basedir, "introducer.furl").encode(get_filesystem_encoding()) private_fn = os.path.join(self.basedir, "private", "introducer.furl").encode(get_filesystem_encoding())
(…) d = self.when_tub_ready() def _publish(res): furl = self.tub.registerReference(introducerservice, furlFile=private_fn) self.log(" introducer is at %s" % furl, umid="qF2L9A") self.introducer_url = furl # for tests d.addCallback(_publish) d.addErrback(log.err, facility="tahoe.init", level=log.BAD, umid="UaNs9A")
(…)
class IntroducerService(service.MultiService, Referenceable): implements(RIIntroducerPublisherAndSubscriberService_v2)
(…)
def publish(self, ann_t, canary, lp): try: self._publish(ann_t, canary, lp) except: log.err(format="Introducer.remote_publish failed on %(ann)s", ann=ann_t, level=log.UNUSUAL, parent=lp, umid="620rWA") raise
(…) |
File: allmydata/client.py
StorageClient:
Subscribe to the Introducer's 'helper' and 'introducer' announcements, possibly within a new Client.init_subscriptions function.
Create the callback function to handle each of these two subscriptions and update BASEDIR/helpers and BASEDIR/introducers accordingly.
HelperServer
After initialization of the server in Client.init_helper, publish the corresponding furl in the introducer with a 'helper' announcement (a sketch follows).
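A sketch of how the 'helper' announcement could be added to the _publish callback of Client.init_helper, mirroring the 'storage' announcement made in init_storage; the publish_helper_furl option and the anonymous-helper-FURL key are assumptions.

# inside Client.init_helper, in the _publish callback, after registerReference:
helper_furl = self.tub.registerReference(self.helper, furlFile=helper_furlfile)
if self.get_config("helper", "publish_helper_furl", False, boolean=True):  # assumed option
    ann = {"anonymous-helper-FURL": helper_furl}                            # assumed key
    current_seqnum, current_nonce = self._sequencer()
    for ic in self.introducer_clients:
        ic.publish("helper", ann, current_seqnum, current_nonce, self._node_key)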
File: allmydata/introducer/server.py
IntroducerServer
During initialization, read the list of alternative Introducers from BASEDIR/introducers.
Once the IntroducerService is active, publish the furl of every alternative introducer known to this Introducer instance (see the sketch after this list).
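A sketch of the Introducer-side change in allmydata/introducer/server.py; the way each furl is finally handed to the announcement machinery (announce_alternative_introducer below) is an assumption that still has to be mapped onto IntroducerService.publish.

def init_alternative_introducers(self, introducerservice):
    # Proposed for IntroducerNode: read BASEDIR/introducers and hand every known
    # furl to the service so it can be announced to subscribed clients.
    cfg = os.path.join(self.basedir, "introducers")
    if not os.path.exists(cfg):
        return
    with open(cfg) as f:
        for line in f:
            furl = line.strip()
            if furl and not furl.startswith("#"):
                introducerservice.announce_alternative_introducer(furl)  # assumed hook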
No modifications are needed in the GUI.
File: allmydata/test/test_introducer.py
Class Client: add test cases to verify:
That the client properly processes the new 'helper' and 'introducer' announcements.
That the client updates BASEDIR/helpers and BASEDIR/introducers properly (a test sketch follows this list).
That the introducer publishes the alternative list of Introducers according to configuration in tahoe.cfg.
That when a client is configured as HelperServer it publishes its furl via the introducer according to configuration in tahoe.cfg.
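A sketch of one of these test cases, written in plain unittest style; make_fake_client and _remember_furl refer to the hypothetical fixtures and helpers sketched earlier in this document, not to existing test utilities.

import os
from twisted.trial import unittest

class HelperAnnouncementTest(unittest.TestCase):
    def test_helpers_file_updated(self):
        # Simulate receiving a 'helper' announcement and check that the furl
        # ends up in BASEDIR/helpers.
        basedir = self.mktemp()
        os.makedirs(basedir)
        client = make_fake_client(basedir)    # assumed test fixture
        furl = "pb://key@example.i2p/helper"  # dummy furl
        client._remember_furl("helpers", furl)
        with open(os.path.join(basedir, "helpers")) as f:
            self.assertTrue(furl in f.read())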
Describe the changes implemented in the following files:
docs/architecture.rst: add a reference to the automatic update of BASEDIR/introducers and BASEDIR/helpers.
docs/configuration.rst: describe new options for StorageClients (auto_update_introducers, auto_update_helpers), for HelperServer (publish_helper_furl) and for IntroducerServer (publish_alternative_introducers)
docs/helper.rst: describe new configuration options.
CrashPlan & Symform (FileSystem) | I2P + Tahoe-LAFS |
---|---|
Distributed decentralized data | X |
Encrypted before transmitting | X |
No file size limits | X |
Manage password & encryption keys | |
Pause backups on low battery | |
Pause backups over selected network interfaces | |
Pause backups over selected wi-fi networks | |
Sync after an inactivity period – configurable | bash scripting |
Do not produce bandwidth bottlenecks | |
Connection through Proxy | |
Does not enumerate IPs | X |
Resilience | X |
Storage Balancing | X |
Summarized volume | |
Anonymous | X |
Sybil Attack protection | |
User Disk Quota | |
The selected social network is Friendica. It is a federated service.
Its main highlight is the possibility to import and export data, posts and likes from other social networks, such as Facebook, Diaspora, Twitter, StatusNet, pump.io, weblogs and RSS feeds, and even email.
It provides a single centralized point of interaction with each of your different social network profiles.
Importer Content Filter
When connectors import data from another social network, it is possible to configure which data is imported and which is not.
The filter works on image content, post content and on who is posting the information.
For example, you can configure it not to import «cat photos» or «military texts».
It is well known that content filters can produce false positives and false negatives; we expect accuracy to improve with each release. A minimal sketch of such a filter is shown below.
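The following is a minimal sketch of the importer content filter, assuming a keyword-and-author based first version (image-content analysis would plug in later). Friendica itself is PHP; Python is used here only to illustrate the logic, and all names are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class FilterRules:
    blocked_keywords: set = field(default_factory=set)   # e.g. {"cat", "military"}
    blocked_authors: set = field(default_factory=set)     # e.g. {"spammer@example.com"}

def should_import(post: dict, rules: FilterRules) -> bool:
    """Return True if an imported post passes the configured content filter."""
    if post.get("author") in rules.blocked_authors:
        return False
    text = (post.get("body") or "").lower()
    return not any(keyword in text for keyword in rules.blocked_keywords)

# Example: block «cat photos» as described above.
rules = FilterRules(blocked_keywords={"cat", "military"})
print(should_import({"author": "alice", "body": "My cat photos"}, rules))  # False
```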
Content Indexer
Friendica indexes posts, images, tags, and users.
The main GUI offers a search for friends and a search for content. Both can be combined, but the user can choose which search to perform.
This index never leaves your computer; different users have different content index databases. A sketch of such a local index follows.
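As a rough illustration (not Friendica's actual schema), a per-user local index could be kept in SQLite with full-text search; this assumes an SQLite build with the FTS5 extension, which is the common case.

```python
import sqlite3

db = sqlite3.connect("content_index_user1.db")   # one database per user
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS posts USING fts5(author, tags, body)")
db.execute("INSERT INTO posts VALUES (?, ?, ?)",
           ("alice", "travel,photo", "Pictures from the trip to Toruń"))
db.commit()

# Full-text query over the local index; nothing leaves the machine.
for author, tags, body in db.execute(
        "SELECT author, tags, body FROM posts WHERE posts MATCH ?", ("trip",)):
    print(author, tags, body)
```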
GUI
The Friendica GUI is redesigned to a modern 2015 look: fully responsive, HTML5+CSS3.
A content search box is embedded as well.
Friendica Bug Fixing
As with all software, Friendica has errors reported on its bug tracker that need to be solved. The full list is included in the Friendica development Appendix 3.3.
TODO: tor + certificates + federation
 | Diaspora | Friendica |
---|---|---|
OpenID Login | ||
Search for people | x | x |
search for places | ||
search for things | x | x |
update status | x | x |
add photos | x | x |
add video | x | x |
add friends | x | x |
add links | x | x |
add advertisements | ||
send messages | x | x |
multi conversations | x | |
video conversations | ||
mute conversation | ||
change a name of multi conversation | ||
be online/offline | ||
see if somebody uses Facebook on phone or computer | |
block chat for specific groups/people | ||
turn off chat for specific groups/people | ||
turn on chat only for specific groups/people | ||
use chat | ||
use of emoticons | ||
use stickers | > | |
send links/photos/videos in conversation | x | x |
word-searcher in full conversation | ||
archive conversation | |
delete message/conversation | |
report as spam or abuse | |
mark as read/unread | |
messenger shows the hour of sending the message | x |
create pages | x | |
create poll | x | |
create ads | ||
like things | x | x |
comment things | x | x |
share things | x | x |
pokes | ||
edit posts | x | x |
edit status | x | x |
watch activities | x | x |
news feed | x | |
play games | ||
create events | x | |
edit profile of the event | x | |
option: participate/maybe/decline in events | ||
shows weather forecast for the day of the event | ||
invite your friends to event | x | |
remove yourself from guest list | ||
export event | x | |
create groups | x | |
manage your group | ||
pin posts | ||
private/ open/ closed group | ||
join groups | ||
leave groups | ||
stop notification | x | |
add photos | ||
add members | ||
add files | ||
add events | ||
ask questions | ||
change administrator | ||
report group | ||
follow/unfollow friends | x | |
follow/unfollow posts | x | |
tag people in photos | x |
tag people in posts/status | |
add description for picture | x | |
edit/add profile picture | x | x |
add/change cover photo | ||
update personal information: | x | x |
· Work and Education | ||
· Relationship | ||
· Family | ||
· Places Lived | x | |
· Basic Information | x | |
· Contact Information | ||
· Life Events | ||
· Interests | x | x |
manage sections | ||
create albums | ||
add friends | x | x |
unfriend | x | x |
suggest friends to other person | x | |
divide friends into groups, e.g. close friends, acquaintances | x | x |
activity log | x | x |
change general account settings | x | x |
edit security settings | x | |
extra protection for people under 18 | ||
privacy settings concerning added stuff by you | x | x |
restrictions about who can contact you | x | |
restrictions about looking up | ||
blocking apps/ games/ advertisements/ events/ users | only users | only users |
possibility of choosing the way of getting notifications (e-mail, messages, on Facebook) | E-mail only | E-mail only |
decide who can follow you | ||
payment settings | ||
application for mobile phone | x | x |
help service | x | x |
report problems | x | x |
users can translate network to other languages | ||
translations are approved by user vote | |
message sending with pressing enter | ||
can connect with other networks | x | x |
TIMELINE - All sorted By Priority | Development Hours | 161,25h | Cost | 6450€ |
First Month(all TODO) | 161,25h | 6450€ |
---|---|---|
PHP Fatal error accessing profile pages with a lot of posts | 1,25h | 50€ |
Navigating to index page with HTTPS forced does not redirect to HTTPS. | 3,75h | 150€ |
poller.php error | 1,25h | 50€ |
Impossible to make an introduction | 2,5h | 100€ |
button breaks the theme | 1,25h | 50€ |
private message is not visible | 1,25h | 50€ |
Same id for original status and retweeted status. | 2,5h | 100€ |
Spaces are Being Removed from Photo URLs | 1,25h | 50€ |
Do prevent stream from jumping around when new posts arrive | 1,25h | 50€ |
Browser UserAgentString for WebOS missing | 3,75h | 150€ |
Infinite duplicate posts in Facebook | 1,25h | 50€ |
posts to other people's walls can't be edited | 2,5h | 100€ |
openid failure with a server that has multiple openid-s | 1,25h | 50€ |
Feature Request: A Home-Button | 1,25h | 50€ |
Feature Request: PGP Clearsigning Beautification | 3,75h | 150€ |
Scheduled Posts | 5h | 200€ |
Image upload in comments impossible | 2,5h | 100€ |
Improve emoticons | 2,5h | 100€ |
Posting a new comment shows a (1) counter at the home menu item. | 2,5h | 100€ |
EveryAuth Login Integration (www.everyauth.com) | 2,5h | 100€ |
XMPP/Jabber integration (www.conversejs.org) | 1,25h | 50€ |
- option: participate/maybe/decline in events | 2,5h | 100€ |
- remove yourself from event guest list | 1,25h | 50€ |
follow/unfollow friends | 1,25h | 50€ |
follow/unfollow posts | 1,25h | 50€ |
tag people in posts/ status/ | 2,5h | 100€ |
add/change cover photo | 1,25h | 50€ |
Add Profile Information | 3,75h | 150€ |
· Work and Education | ||
· Relationship | ||
· Family | ||
· Places Lived | ||
· Basic Information | ||
· Contact Information | ||
· Life Events | ||
create photo albums | 2,5h | 100€ |
extra protection for people under 18 | 2,5h | 100€ |
Allow if you want to be searched | 1,25h | 50€ |
Possibility of choosing the way of getting notifications (e-mail, messages, on Facebook); currently it is e-mail only | 2,5h | 100€ |
decide who can follow you | 2,5h | 100€ |
Friendly UI redesign: wireframe redesign & front-end layout | 40h | 1600€ |
Friendly UI redesign: front-end CSS3 development | 40h | 1600€ |
Friendly UI redesign: develop front-end connections | 10h | 400€ |
The chosen search engine is YaCy. It is developed in Java, with a distributed index database that is shared when users make a query.
This way, no single computer has to hold all of the crawled Internet content.
Users can access the YaCy search engine from a regular URL, or integrated into OwnCloud.
Webcrawler
YaCy is modified to be able to index, at the same time: intranet, extranet, darknets (I2P/Tor) and internal plugins.
Some of the indexed results will be shared with other YaCy nodes and some will not (see the sketch after this list):
- Extranet (regular internet) results are shared with all YaCy nodes.
- Intranet results are shared with other YaCy nodes on the same intranet.
- Darknet results are shared with other darknet YaCy installations.
- Internal plugin results are not shared with anybody.
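The sharing rule can be stated compactly as below. This is an illustrative sketch only, not existing YaCy code (YaCy is Java; Python is used here just to spell out the policy).

```python
from enum import Enum

class Source(Enum):
    EXTRANET = "extranet"          # regular internet
    INTRANET = "intranet"
    DARKNET = "darknet"            # I2P / Tor
    INTERNAL_PLUGIN = "plugin"     # OwnCloud, Friendica, e-mail plugins

def share_scope(source: Source) -> str:
    """Return who an indexed result may be shared with."""
    if source is Source.EXTRANET:
        return "all YaCy nodes"
    if source is Source.INTRANET:
        return "YaCy nodes on the same intranet"
    if source is Source.DARKNET:
        return "YaCy installations on the same darknet"
    return "nobody (kept local)"

print(share_scope(Source.INTERNAL_PLUGIN))   # nobody (kept local)
```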
Search page
When the user is on the YaCy search page, he can select which results he wants to get. By default all are checked: internal, external, darknets and internal plugins.
WebCrawler Internal Plugin: OwnCloud
This YaCy plugin is used to wrap the indexing that OwnCloud has already done (see the OwnCloud file indexing section).
YaCy can show OwnCloud files and their content, redirecting to the OwnCloud installation.
WebCrawler Internal Plugin: Friendica
This YaCy plugin is used to wrap the indexing that Friendica has already done (see the Friendica content indexing section).
YaCy can show Friendica people, posts and images, redirecting to the local Friendica installation.
WebCrawler Internal Plugin: Emails
This YaCy plugin is used to index the e-mails on the system.
YaCy indexes the mail subject, sender, recipients and attachments if they are not encrypted; otherwise it indexes whatever it can.
If the e-mail is GPG-encrypted, it likewise indexes what it can. A minimal field-extraction sketch follows.
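As a rough illustration of which fields would be pulled from a message on disk, the sketch below uses only the Python standard library; the wiring into YaCy itself (Java) is not shown, and the function name is made up for the example.

```python
from email import policy
from email.parser import BytesParser

def indexable_fields(path: str) -> dict:
    """Extract the fields the e-mail plugin would index from one message file."""
    with open(path, "rb") as f:
        msg = BytesParser(policy=policy.default).parse(f)
    fields = {
        "subject": msg["Subject"] or "",
        "sender": msg["From"] or "",
        "recipients": msg["To"] or "",
        "attachments": [part.get_filename() for part in msg.iter_attachments()
                        if part.get_filename()],
    }
    # The body is only indexed when it is not an encrypted payload (e.g. GPG).
    if msg.get_content_type() != "multipart/encrypted":
        body = msg.get_body(preferencelist=("plain",))
        if body is not None:
            fields["body"] = body.get_content()
    return fields
```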
Search Improvement
YaCy's search results need to be improved in order to be fully competitive with Google.
OwnCloud Integration
On the OwnCloud main GUI there is a YaCy search box, similar to the way Google search is integrated with Gmail.
It uses the JSON query URL to get YaCy results directly and show them inside OwnCloud.
The YaCy JSON API query URL is http://localhost:8090/yacysearch.json?query=microsoft
The results are then shown in the frontend (see the sketch below).
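A small sketch of querying that JSON URL follows. In the real integration the call is made from the OwnCloud frontend in JavaScript; Python is used here only to show the request/response shape, and the response field names (channels / items / title / link) are assumptions based on YaCy's OpenSearch-style JSON output that should be verified against the running version.

```python
import json
import urllib.parse
import urllib.request

def yacy_search(query: str, base="http://localhost:8090"):
    """Query the local YaCy node and yield (title, link) pairs."""
    url = "%s/yacysearch.json?%s" % (base, urllib.parse.urlencode({"query": query}))
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    # Assumed layout: a single channel whose "items" list holds the results.
    for item in data.get("channels", [{}])[0].get("items", []):
        yield item.get("title"), item.get("link")

for title, link in yacy_search("microsoft"):
    print(title, link)
```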
YaCy Bug Fixing
As with all software, YaCy has errors reported on its bug tracker that need to be solved. The full list is included in the YaCy development Appendix 3.4.
Google & Bing & Yahoo | YaCy |
---|---|
Competitive Search Results (if you search a word as “kademar”, it should appear in first place website www.kademar.org) & (appear in first place important links inside this first website) & (improve website search result order by website relevancy) & (improve search results by search sentence) | |
Search for: | |
- text | X |
- images | X |
- videos | |
- shopping | |
- maps | |
- news | |
- books | |
- flights | |
- apps | |
- celebrity | |
The ability to control keys | X |
Related search (in the bottom) | X (keywords in sidebar, sentences in results) |
Language autodetection based on browser language or address; for example, when I type google.co.uk it displays the notice “This site is available in English” | |
Change language by selecting your localisation in settings | Deutsch by default; you can change this in preferences. Five languages are available |
Case insensitive | |
Search pages from the world or just the selected language (browser / chosen language) | |
Search operators: | |
- Search in title (intitle:) | X |
- Search in url (inurl:) | X |
- Search info (info:) | |
- Search cache (cache:) | |
- Search for a number range (eg. camera $50..$100) | |
- Search for either word (eg. world cup location 2014 OR 2018) | |
- Fill in the blank (eg. “a * saved is a * earned”) | |
- Search for pages that are similar to a URL (eg. related:time.com) | |
- Search for pages that link to a URL (eg. link:google.com) | |
- Search within a site or domain (eg. olympics site:nbc.com) | |
- Exclude a word (eg. “jaguar speed -car” or “pandas -site:wikipedia.org”) | |
- Search for an exact word or phrase (eg. “imagine all the people”) | |
- inlink: | |
- author: | |
- tld: | |
- /ftp | |
- /http | |
- /date | |
- /near | |
- /smb | |
- /file | |
Stop-list, i.e. words that are not taken into account (a, the, on, at, in, and, of, punctuation) | |
Apart from standard files html/htm, php/php3, xhtml, asp and indexes other types like: txt, ans, pdf, ps, doc, xls, ppt, wks, wps, wdb, wri, rtf, swf, wk1, wk2, wk3, wk4, wk5, wki, wks, wku, lwp, mw | |
- search for filetypes (eg. “filetype:odt”) | |
Spelling dictionary | ? |
Extras: | |
- calculator | |
- unit converter | |
- currency converter | |
- definitions | |
- map | |
Search by voice | |
Search tools for text: | Search tools for text: |
- by language | |
- by country | |
- by date | |
- by search near | |
Onscreen keyboard | |
- by type of file | |
- by type of server | |
- by url | |
Safe Search filter | |
Dynamic search (dynamic display of results when typing) | X |
Localization by geolocalization (IP) | |
Knowledge Graph: when you type “Torun” it displays information about the city, sometimes the weather (it seems to me they have dropped this option) | |
In the search results, for the most important queries the following are displayed: | |
- link to page with results for pictures | |
- link to page with results for videos | |
- link to page with results for news | |
- related (eg. people) | |
- for sportsmen changing background (mundial) | |
Search engine designed for mobile devices | |
Notification “Looking for results in English?” (English as an example) | |
If an expression is entered incorrectly, it is found in the correct form (in addition it displays “Showing results for: something”, “Search instead for: something”) | |
The ability to remove information from Google (but very hard to do) | |
Search images: | |
- by color | |
- by size | |
- by type | |
- ad management system (google adsense / adwords) | |
- by usage rights | |
- by images | |
- by person | |
- by structure | |
- other: Top gallery | |
- by type of file | |
- by type of server | |
- by url | |
Search history | |
Autocomplete | |
Personalize Search Screen | |
Webcrawler for internet and external sources | |
TIMELINE - All sorted By Priority | Development Hours | 626h | Cost* 15€/h based on a 2400€ salary | 23.475€ |
---|
First Month(bug month) | 272h. | 10.200 € |
---|---|---|
* Performance Issues: http://mantis.tokeek.de/view.php?id=305 | 32h. | 1200€ |
* “Too many open files” while searching&crawling: http://mantis.tokeek.de/view.php?id=406 | 16h. | 600€ |
* Unable to list Process Scheduler: http://mantis.tokeek.de/view.php?id=290 | 8h. | 300€ |
* Yacy does not start: http://mantis.tokeek.de/view.php?id=420 | 24h. | 900€ |
* index 100% CPU: http://mantis.tokeek.de/view.php?id=81 | 24h. | 900€ |
* improve YaCy Web UI: http://mantis.tokeek.de/view.php?id=151 | 16h. | 600€ |
* CPU cycles: http://mantis.tokeek.de/view.php?id=418 | 24h. | 900€ |
* Huge Ram Eater: http://mantis.tokeek.de/view.php?id=282 | 32h. | 1200€ |
* Young mode and DHT issue: http://mantis.tokeek.de/view.php?id=150 | 24h. | 900€ |
* SSL Init Fail: http://mantis.tokeek.de/view.php?id=251 | 16h. | 600€ |
* Infinite crash after one “not enough free space”: http://mantis.tokeek.de/view.php?id=144 | 24h. | 900€ |
* YaCy cant boot anymore after setting up SSL: http://mantis.tokeek.de/view.php?id=323 | 8h. | 300€ |
* Improve search algorithm: http://mantis.tokeek.de/view.php?id=283 | 16h. | 600€ |
* Search engine designed for mobile devices (responsive) | 18h. | 675€ |
Second Month | 246h | 9.225 € |
---|---|---|
* out of memory on big index: http://mantis.tokeek.de/view.php?id=376 | 32h. | 1200€ |
* Search pages from the world or just the selected language (browser / chosen language) | 8h. | 300€ |
* Apart from standard files html/htm, php/php3, xhtml, asp and indexes other types like: txt, ans, pdf, ps, doc, xls, ppt, wks, wps, wdb, wri, rtf, swf, wk1, wk2, wk3, wk4, wk5, wki, wks, wku, lwp, mw | 24h. | 900€ |
* Increase search frequency: http://mantis.tokeek.de/view.php?id=419 | 16h. | 600€ |
* Stop-list, i.e. words that are not taken into account (a, the, on, at, in, and, of, punctuation) | 4h. | 150€ |
* Bandwidth limiter: http://mantis.tokeek.de/view.php?id=165 | 24h. | 900€ |
* Network Autoclean old entries: http://mantis.tokeek.de/view.php?id=20 | 16h. | 600€ |
* Change YaCy process priority: http://mantis.tokeek.de/view.php?id=73 | 16h. | 600€ |
* Case insensitive | 4h. | 150€ |
* Search pages from the world or just the selected language (browser / chosen language) | 8h. | 300€ |
* Search for pages that link to a URL (eg. link:google.com) | 8h. | 300€ |
* Search within a site or domain (eg. olympics site:nbc.com ) | 8h. | 300€ |
* Search for an exact word or phrase (eg. “imagine all the people”) | 4h. | 150€ |
* search for filetypes (eg. “filetype:odt”) | 8h. | 300€ |
* Import Open StreetMap data in YaCy: http://mantis.tokeek.de/view.php?id=226 | 32h. | 1200€ |
* Personalize SearchEngine Screen | 18h. | 675€ |
* Onscreen keyboard | 16h. | 600€ |
Third Month | 100h | 3750 € |
---|---|---|
* Knowledge Graph: when you type “Torun” it displays information about the city, sometimes the weather | 40h. | 1500€ |
* Competitive Search Results (if you search a word as “kademar”, it should appear in first place website www.kademar.org) & (appear in first place important links inside this first website) & (improve website search result order by website relevancy) & (improve search results by search sentence) | 60h. | 2250€ |
OwnCloud already has ODT (text) editing based on webodf.org technology.
When OwnCloud gains end-to-end JavaScript encryption («secure mode»), the way the collaboration suite makes its connections must change, to make online editing possible in an end-to-end encrypted scenario.
New Network Connection Model: 1 user
When a user enters OwnCloud to edit a document:
- The browser loads the document editor JS.
- It loads the document file, and OwnCloud records that this user is editing the document (master connection).
- Only the master connection can save the document back to OwnCloud.
- If it detects that the file is encrypted, it asks for the document password to decrypt it.
- The document editing suite and the document are loaded in the user's RAM. When the document is saved, it is encrypted with the password kept in memory and sent back to the server (see the sketch after this list).
- On close, OwnCloud removes the record that somebody is editing this document.
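A conceptual sketch of the encrypt-on-save step follows: the password never leaves the editing session, and the server only ever receives ciphertext. In the real design this runs as JavaScript in the browser; Python and the cryptography package are used here only to make the flow concrete, and the key-derivation parameters are placeholder assumptions.

```python
import base64, os
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives.hashes import SHA256
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a symmetric key from the document password kept in memory."""
    kdf = PBKDF2HMAC(algorithm=SHA256(), length=32, salt=salt, iterations=480_000)
    return base64.urlsafe_b64encode(kdf.derive(password.encode()))

def encrypt_for_upload(document: bytes, password: str) -> bytes:
    salt = os.urandom(16)
    return salt + Fernet(derive_key(password, salt)).encrypt(document)

def decrypt_after_download(blob: bytes, password: str) -> bytes:
    salt, token = blob[:16], blob[16:]
    return Fernet(derive_key(password, salt)).decrypt(token)

# The server only ever sees the output of encrypt_for_upload().
blob = encrypt_for_upload(b"<office:document>...</office:document>", "document password")
assert decrypt_after_download(blob, "document password").startswith(b"<office:document>")
```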
New Network Connection Model: multiple users – Realtime P2P
User 2 can open the same file for editing by using a direct link or the same OwnCloud GUI (slave connection).
While the interface is loading, OwnCloud sends a notification to User 1. At that moment, User 1's OwnCloud GUI saves the document and locks the interface while the new member is joining.
User 2 downloads the current document version, and his OwnCloud asks for the decryption password.
When User 2 is connected, User 1's interface is unlocked.
The master connection and the slave connections talk peer-to-peer, without middle nodes.
Every modification, pointer position and change is shared between users directly.
They see changes simultaneously, but only the master connection writes changes to OwnCloud.
When User 1 (master connection) disconnects, the master mark is handed to User 2, who then writes the document.
If User 1 reconnects, or User 3 connects, the master connection remains with User 2 (see the hand-over sketch below).
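The hand-over rule can be illustrated as below. This is a hypothetical sketch, not existing OwnCloud or WebODF code: the first participant holds the master mark, the longest-connected remaining slave is promoted when the master leaves, and late (re)joiners never take the mark from the current master.

```python
class EditingSession:
    """Tracks who holds the master connection for one shared document."""
    def __init__(self):
        self.participants = []                    # ordered by join time

    @property
    def master(self):
        return self.participants[0] if self.participants else None

    def join(self, user: str):
        # A (re)joining user always enters as a slave; the master keeps the mark.
        self.participants.append(user)

    def leave(self, user: str):
        was_master = (user == self.master)
        self.participants.remove(user)
        if was_master and self.master:
            print("master connection handed to", self.master)

session = EditingSession()
session.join("user1"); session.join("user2")
session.leave("user1")                            # -> master connection handed to user2
session.join("user1"); session.join("user3")
assert session.master == "user2"                  # master remains with user2
```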
Support more files
We contribute to WebODF to create ODS (spreadsheet) and ODP (presentation) viewers, and then editors.
We contribute to WebODF to add more features to its ODT document editor.
The OwnCloud editing suite will then be a fully working office suite.
The conferencing solution is XMPP + OTR (encryption) + WebRTC (video).
There is an already developed OwnCloud plugin that does this: https://apps.owncloud.com/content/show.php/JavaScript+XMPP+Chat?content=162257
It uses the ejabberd XMPP server.
We will compile ejabberd with Tor support (https://spaceboyz.net/~astro/ejabberd-2.0.x+tor.patch).
We provide a set of «grey» communitycube XMPP servers. Those servers run Prosody with mod_onions, which allows them to connect through Tor or to regular servers, creating bridges between the two networks.
When using OwnCloud + XMPP, the user can choose to use our grey communitycube Jabber servers or a regular server like jabber.ccc.de.
The user could also connect with a regular Jabber client.
Encryption
Encryption of XMPP communications is handled by the OTR protocol extension.
Video conferencing will be handled by WebRTC.
WebRTC needs a STUN/TURN server to broker connections; we will run our own, such as webrtc.communitycube.net (see the configuration sketch below).
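For illustration, the sketch below shows how a WebRTC peer would be pointed at that STUN/TURN host. webrtc.communitycube.net is the planned hostname from this proposal, not an existing service, and the credentials are placeholders; real clients would do the equivalent from the browser (e.g. via PeerJS), while aiortc is used here only to show the ICE configuration.

```python
from aiortc import RTCConfiguration, RTCIceServer, RTCPeerConnection

# ICE servers pointing at the proposed communitycube broker (placeholder values).
config = RTCConfiguration(iceServers=[
    RTCIceServer(urls="stun:webrtc.communitycube.net:3478"),
    RTCIceServer(urls="turn:webrtc.communitycube.net:3478",
                 username="user", credential="secret"),
])
pc = RTCPeerConnection(configuration=config)
```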
Proposed technology
http://www.html5rocks.com/en/tutorials/webrtc/basics/
http://www.html5rocks.com/en/tutorials/webrtc/infrastructure/
As the connection broker server we use PeerJS Server (a WebRTC connection broker).
Thank you for your awesome support!